Abstract

Pattern matching has proved an extremely powerful and durable notion in functional 
programming. This paper contributes a new programming notation for type theory which 
elaborates the notion in various ways. 

Firstly, as is by now quite well-known in the type theory community, definition by 
pattern matching becomes a more discriminating tool in the presence of dependent types, 
since it refines the explanation of types as well as values. This becomes all the more true 
in the presence of the rich class of datatypes known as inductive families (Dybjer, 1991). 

Secondly, as proposed by Peyton Jones (Peyton Jones, 1997) for Haskell, and independently
rediscovered by us, subsidiary case analyses on the results of intermediate computations,
which commonly take place on the right-hand side of definitions by pattern
matching, should rather be handled on the left. In simply-typed languages, this subsumes
the trivial case of Boolean guards; in our setting it becomes yet more powerful.

Thirdly, elementary pattern matching decompositions have a well-defined interface given 
by a dependent type; they correspond to the statement of an induction principle for the 
datatype. More general, user-definable decompositions may be defined which also have 
types of the same general form. Elementary pattern matching may therefore be recast 
in abstract form, with a semantics given by translation. Such abstract decompositions 
of data generalize Wadler’s notion of ‘view’ (Wadler, 1987). The programmer wishing to 
introduce a new view of a type T, and exploit it directly in pattern matching, may do 
so via a standard programming idiom. The type theorist, looking through the Curry- 
Howard lens, may see this as proving a theorem, one which establishes the validity of a 
new induction principle for T. 

We develop enough syntax and semantics to account for this high-level style of programming
in dependent type theory. It culminates in the development of a typechecker
for the simply-typed lambda calculus, which furnishes a view of raw terms as either being 
well-typed, or containing an error. The implementation of this view is ipso facto a proof 
that typechecking is decidable. 


1 Introduction 

This paper is a contribution to declarative programming, in that it introduces a 
new high-level notation for functional programming on top of an existing low-level 
dependent type theory. In particular, we offer a powerful and abstract successor to
pattern matching, as conceived by Rod Burstall (Burstall, 1969) and, to our knowledge,
first implemented in Fred McBride’s extension of LISP (McBride, 1970).

The key feature of pattern matching in simply typed languages is that the structure
of an arbitrary value in a datatype is explained. Classically, pattern matching
analyses constructor patterns on the left-hand sides of functional equations, and
is defined by a subsystem of the operational semantics with hard-wired rules for
computing substitutions from the pattern variables to values. For example, in Standard
ML (Milner et al., 1997), one might test list membership as follows:


fun elem k []        = false
  | elem k (l :: ls) = if (k = l) then true else elem k ls


The clarity of the code does not hinder its efficient compilation; a key technique here
is Augustsson’s analysis in terms of hierarchical switching on the outermost constructor
symbol, coupled with the exposure of subexpressions (Augustsson, 1985).
This yields, for elem above, the following cascade of case expressions:


fun elem k ls = case ls
                  of []       => false
                   | l :: ls' => case (k = l)
                                   of true  => true
                                    | false => elem k ls'

Pattern matching has proved such a powerful and durable notion in functional
programming that its further development has remained firmly on the research
agenda. Peyton Jones’ idea of pattern guards (Peyton Jones, 1997; Peyton Jones &
Erwig, 2000) allows definitions by pattern matching to handle, on the left-hand side
of programs, subsidiary analysis of the results of intermediate computations, which
are more commonly, but “clunkily” (loc. cit.), handled on the right. For elem, we
can pull both tests to the left as follows:

elem k []                        = False
elem k (l:ls) | True  <- k == l  = True
elem k (l:ls) | False <- k == l  = elem k ls

Of course, Haskell’s Boolean guards (Peyton Jones & Hughes, 1999) can already
qualify pattern matches by tests like k == l, but pattern guards handle subcomputations
of more complex types. Further, the guard expression can be shared via
a where clause and the layout rule. In our notation, you can achieve the same effect
by grouping the two clauses in the scope of the call to k = l, as follows:
elem k []         ↦ false
elem k (l :: ls)  | k = l
                  | true   ↦ true
                  | false  ↦ elem k ls

Dependent types add a descriptive and expressive power which makes pattern
matching a more discriminating tool, refining types as well as values. Each elementary
pattern matching decomposition has a well-defined interface given by a
dependent type, corresponding to an induction principle for the datatype (Burstall,
1969; Nordström et al., 1990). This insight flows from type theory’s interplay between
computation and reasoning, usually sloganised as the ‘Curry-Howard correspondence’,
or ‘propositions-as-types’. The key feature of induction is that the
result type is instantiated, and hence further explained, by the patterns.

This observation bites all the more strongly in the presence of the rich class of 
datatypes known as inductive families (Dybjer, 1991). One such is So, a collection 
of types indexed by a Boolean value: 

$$\mathbf{data}\ \ \frac{b : \mathsf{Bool}}{\mathsf{So}\ b : \star} \qquad \mathbf{where}\ \ \frac{}{\mathsf{oh} : \mathsf{So}\ \mathsf{true}}$$

The point here is that So true has one element whilst So false has none. If p : So b,
then ‘case’ on p tells us not only that p is oh, but also (‘for free’) that b must be
true. Inspecting p can instantiate b and hence any type which depends on either!
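The refinement described above can be approximated in Haskell with a GADT. This is only a sketch under assumptions: the names `So`, `Oh`, `SBool`, `witness` and `toBool` are ours, the Boolean index lives at the type level (via `DataKinds`), and a singleton type `SBool` stands in for run-time dependency on `b`.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- So b is inhabited exactly when the type-level Boolean b is 'True.
data So (b :: Bool) where
  Oh :: So 'True

-- Singleton Booleans: a run-time value whose type determines b.
data SBool (b :: Bool) where
  STrue  :: SBool 'True
  SFalse :: SBool 'False

-- Matching on Oh refines b to 'True, so STrue has the required type SBool b.
witness :: So b -> SBool b
witness Oh = STrue

-- Forget the index again, to observe the result as an ordinary Bool.
toBool :: SBool b -> Bool
toBool STrue  = True
toBool SFalse = False
```

The single clause of `witness` typechecks only because inspecting the proof instantiates `b`, which is the ‘for free’ information the text refers to.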

We can use So to impose Boolean ‘preconditions’ on programs. For example, a
program which requires an argument $p : \mathsf{So}\ (test_1\ \mathsf{or}\ test_2)$ need only be defined
under circumstances which make one of the test expressions evaluate to true. If
such a program were to switch on the value of $test_1$, say, we should somehow
‘know’ that $p : \mathsf{So}\ \mathsf{true}$ in the true case and that $p : \mathsf{So}\ test_2$ otherwise, but how
might a typechecker make this connection? Our | notation is motivated not just by
convenience, but also to signal the abstraction of subcomputations from types.

Meanwhile, Wadler’s ‘views’ proposal (Wadler, 1987; Burton et al., 1996) allows
programmers to implement new schemes for decomposing values in types (abstract
datatypes, especially), extending the syntax of matching correspondingly. In our
setting, user-definable decompositions (elimination operators) may be specified
by types resembling the structural induction principles for datatypes, now the
primitives from which higher-level analyses can be developed compositionally.

Our notation gives a pattern-based syntax to programming with arbitrary eliminators;
the semantics is given by translation, rather than ‘pattern matching’ per se.
Further, we establish a standard idiom of first-order programming for equipping
a type T with a new elimination operator, by identifying a set of patterns which
cover the values in T; such patterns may now be arbitrary expressions of type T.
The type theorist, looking through the Curry-Howard lens, may see this as proving
a new induction principle for T. A similar idea has emerged recently in Voda’s
untyped first-order ‘Clausal Language’ (Voda, 2002), which admits new forms of
case analysis via theorem-proving in Peano Arithmetic.

Although the power of dependent types is widely acknowledged, sceptics rightly 
argue that expressibility is one thing and accessibility another. Programs should be 
read as well as written, often on the back of an envelope. Here, we address this issue 
of clarity. We claim that the existing notations of both functional languages and 
type theory fall short of what dependently typed programming demands, but also 
of what it can supply—a language of derived forms, rich, intuitive and extensible. 
Type theory offers the motive, the methods and the opportunity to ask anew what 
functional programming can aspire to be. We barely scratch the surface in this 
paper—nevertheless, we hope to engage your enthusiasm and your imagination. 


1.1 Background 

We start from a type theory with inductive families of datatypes (Dybjer, 1991),
essentially Luo’s UTT (Luo, 1994), as implemented in Oleg, the first author’s
adaptation (McBride, 1999) of Pollack’s proof assistant Lego (Luo & Pollack, 1992;
Pollack, 1995). This type system is strongly normalizing (Goguen, 1994) and hence
typechecking is decidable. An important and distinctive feature, which we expand
upon below, is that inductive families embrace data structures richer than those
available in other candidate languages for dependently-typed programming such
as DML (Xi, 1998) or Cayenne (Augustsson, 1998): the former supports compile-time
enforcement of finer well-formedness constraints on data which is nonetheless
only Hindley-Milner typable; as to the latter, we explore an example not readily
expressible in Cayenne (well-typed $\lambda$-terms over simple types) in Section 7.

Datatypes in UTT come with no intrinsic notion of pattern matching, by contrast 
with systems like ALF (Coquand, 1992; Magnusson, 1994). Primitive computation 
on datatypes is provided via ‘elimination operators’ (the ‘introduction operators’ 
being constructors), which behave operationally like primitive recursors, but have 
types which state structural induction principles. 

For example, the elimination operator for the natural numbers has the following 
type—compare the Hindley-Milner type scheme for primitive recursion: 

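$$\begin{array}{ll}
\mathbb{N}\text{-Elim} : \forall P : \mathbb{N} \to \star. & \mathbb{N}\text{-PrimRec} : \forall T : \star. \\
\quad P\ 0 \to & \quad T \to \\
\quad (\forall k : \mathbb{N}.\ P\ k \to P\ (\mathsf{s}\ k)) \to & \quad (\mathbb{N} \to T \to T) \to \\
\quad \forall n : \mathbb{N}.\ P\ n & \quad \mathbb{N} \to T
\end{array}$$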

Observe that $\mathbb{N}$-Elim delivers an inhabitant of a dependent function space, in
this case $\forall n : \mathbb{N}.\ P\ n$. This allows us to specify, via an arbitrary program $P$ (the
‘motive’), different outcomes intended for different values of $n$. Learning more about
$n$ can change the things we are able to do with it, hence we can express numerically
indexed operations such as matrix multiplication. By contrast, $\mathbb{N}$-PrimRec’s type
allows no connection between the number and the purpose it serves.

The arguments of $\mathbb{N}$-Elim which explain each case also have more informative types
than in the Hindley-Milner version. We call these arguments methods (where
the vernacular speaks only, somewhat weakly, of ‘base’ and ‘step’ cases, without
naming ‘the argument for such a case’) because they describe how the motive is to
be pursued, depending on the value of $n$. Method types document explicitly the
values for which we use them, a possibility only when types can depend on data.
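To make the contrast between the two types concrete, here is a hedged Haskell sketch. It is an approximation, not the paper's type theory: Haskell has no true dependent function space, so we assume a singleton type `SNat` to link a run-time number to a type-level index, and all names (`primRec`, `natElim`, `Vec`, `replicateV`) are our own.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes #-}
import Data.Kind (Type)

data Nat = Z | S Nat

-- Hindley-Milner primitive recursion: the result type t cannot mention the number.
primRec :: t -> (Nat -> t -> t) -> Nat -> t
primRec z _ Z     = z
primRec z s (S k) = s k (primRec z s k)

-- Singleton naturals: the value of type SNat n determines the index n.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- Eliminator with a motive p: each method states the index it is used for.
natElim :: forall (p :: Nat -> Type) (n :: Nat).
           p 'Z -> (forall k. SNat k -> p k -> p ('S k)) -> SNat n -> p n
natElim z _ SZ     = z
natElim z s (SS k) = s k (natElim z s k)

-- Length-indexed vectors, index last, so that (Vec a) can serve as a motive.
data Vec (a :: Type) (n :: Nat) where
  Nil  :: Vec a 'Z
  Cons :: a -> Vec a n -> Vec a ('S n)

-- A numerically indexed operation: the result length is fixed by the argument.
replicateV :: a -> SNat n -> Vec a n
replicateV x = natElim Nil (\_ xs -> Cons x xs)

toList :: Vec a n -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

-- The same recursion scheme without a motive can only target one fixed type.
natToInt :: Nat -> Int
natToInt = primRec 0 (\_ r -> r + 1)
```

In `replicateV` the motive is `Vec a`, so the base method must build a vector of length zero and the step method must extend length `k` to length `S k`; `primRec` has no way to state such a connection.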

A key point of this paper is that the types of eliminators give an abstract interface 
to pattern analysis, whatever the actual patterns are. For example, the trichotomy 
principle can be seen as an operator eliminating two natural numbers: 

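$$\begin{array}{l}
\mathbb{N}\text{-Compare} : \forall P : \mathbb{N} \to \mathbb{N} \to \star. \\
\quad (\forall x, y : \mathbb{N}.\ P\ x\ (x + \mathsf{s}\ y)) \to \\
\quad (\forall x : \mathbb{N}.\ P\ x\ x) \to \\
\quad (\forall x, y : \mathbb{N}.\ P\ (y + \mathsf{s}\ x)\ y) \to \\
\quad \forall m, n : \mathbb{N}.\ P\ m\ n
\end{array}$$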

We will show in Section 4 below how to use such operators in general, and in Section 6
how to construct (a variant of) $\mathbb{N}$-Compare, which we may then use to
define functions which in ordinary programming would be computed by a combination
of a Boolean test and subtraction, where this operation is rendered safe to
perform by the outcome of the test.
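As a rough, simply-typed illustration of the trichotomy principle, the following Haskell sketch returns a three-way result that carries the witnesses of each case. It is an assumption-laden approximation: `Compare`, `compareView` and `monus` are hypothetical names, and the relationship between the arguments and the carried data is recorded only in comments, whereas the type of $\mathbb{N}$-Compare enforces it.

```haskell
import Numeric.Natural (Natural)

-- The three cases of trichotomy, each carrying the data that witnesses it.
data Compare
  = Lt Natural Natural  -- Lt x y: the arguments were x and x + (y + 1)
  | Eq Natural          -- Eq x:   both arguments were x
  | Gt Natural Natural  -- Gt x y: the arguments were y + (x + 1) and y
  deriving (Eq, Show)

-- Analyse two numbers once; subtraction happens only where it is known to be safe.
compareView :: Natural -> Natural -> Compare
compareView m n
  | m < n     = Lt m (n - m - 1)
  | m == n    = Eq m
  | otherwise = Gt (m - n - 1) n

-- Truncated subtraction, defined by case analysis on the view
-- rather than by a Boolean test followed by a separate subtraction.
monus :: Natural -> Natural -> Natural
monus m n = case compareView m n of
  Lt _ _ -> 0
  Eq _   -> 0
  Gt x _ -> x + 1
```

A caller of `compareView` never subtracts for itself: the difference arrives packaged with the case that justifies it.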

Elimination operators are first-class values, and their types are sufficient on their
own to document their usage in programs. Hence they may be abstracted in signatures
which hide their representation without further ado. Moreover, as we shall see
below, for the class of datatype families which we consider, certain distinguished
elimination operators may be defined automatically.


1.2 Outline of the rest of the paper 

Section 2 describes the basic type theory in which we work, augmented with a 
concrete syntax for programming. This is then explained by elaboration into an 
extension of the basic type theory which uses labels in terms and types to correlate 
the usage of a concrete syntax program with its elaboration. 

In Section 3 we focus upon the language of inductive families and their properties.
We identify a taxonomy of possible type dependency in case analyses through
consideration of a running example based on heterogeneous association lists.

In Section 4 we give a technical characterization of eliminators, together with
the ⇐ (‘by’) construct which supports their use, whether primitive or user-defined.
We discuss in depth the method by which we exploit elimination with equational
constraints to explain the notion of patterns, as well as arbitrary structured decomposition,
on the left-hand sides of program definitions. In particular, we consider a
useful derived form for dealing with structural recursion.

In Section 5, we discuss the general situation of decomposing the results of subcomputations.
Our | (‘with’) construct supports this, generalizing pattern guards
to the dependently-typed setting. This notation retains economy of expression, but
also allows delicate type distinctions to be made during case analysis: without it,
we would need explicit helper functions with much more complex type signatures.

Although elimination operators are higher-order functions, Section 6 introduces a 
first-order programming idiom for constructing and working with them—this is our 
account of views. 

In Section 7, we conclude our technical discussion with a large example: a typechecker
for simply-typed lambda calculus with explicit type labels, that is, ‘Church-style’
(pre-)terms in Barendregt’s terminology (Barendregt, 1992). The program takes
the form of a view of pre-terms as being either well-typed or containing an error.
The implementation of this view is a proof that typechecking is decidable.

In an epilogue, we discuss our findings and future work. 


1.3 Some history; some culture 

Our background is mainly in the field of interactive theorem proving in type theory, 
using the Lego/Oleg system. Consequently, the original draft of this paper had a 
very different emphasis: firstly, we focused on supporting an interactive method of 
programming. Indeed, while Oleg does not directly support the notations described 
in this paper, it does provide the tactics which inspired them—and which translate 
them into raw type theory. We developed all our examples interactively using these 
tactics. 

Secondly, and perhaps more seriously, it was motivated from the ‘logical’ perspective
on type theory. Regardless of the merits of this viewpoint, “dependent types”
scarcely approached “practical programming” in terms of contributing to a dialogue
between communities. This is not a new phenomenon: a good illustration lies
in the papers by Bird and Paterson, and Altenkirch and Reus, each writing about
the type of de Bruijn $\lambda$-terms, as a nested type in (Bird & Paterson, 1999), and as
an inductive family in (Altenkirch & Reus, 1999). The two share but a single common
reference: Wadler’s “Theorems for Free!” (Wadler, 1989). Would that more
researchers had Wadler’s ability to speak to both communities with equal effect.

Likewise, though we were inspired by Wadler’s original proposal for views, we had
worked in ignorance of subsequent elaborations of that idea and related developments,
not least Peyton Jones’ (1997) note. Quite independently, we had arrived at
essentially the same formulation, but motivated by considerations of typing, rather
than evaluation. Rod Burstall used to say to us that “Proofs are harder for students
to understand than programs, because once you’ve obtained a proof, it isn’t
obvious what to do with it, or what it means to run one,” in spite of what Curry-Howard
might lead one to believe. Our experience teaching students is that only
by connecting patterns to the types which give rise to them can the computational
meaning and use of pattern matching be fully grasped.

Acknowledgements We gratefully acknowledge the support of the EPSRC, with 
grants GR/N 24988 and GR/R 72259. We also thank the organisers of Dagstuhl 
seminars 01141, “Semantics of Proof Search”, and 01341, “Dependent Types Meets 
[…] ideas. Healf Goguen had important, and early, influence on this work, supervising
the first author’s PhD. We have received much good advice from the anonymous 
referees on how to improve this paper and from our colleagues, especially Randy 
Pollack. At a late stage, Thorsten Altenkirch and Roland Backhouse helped us out 
of a tight spot with coffee and printing facilities. Our main debt, however, is to the 
programmers who have inspired us: Rod Burstall, Fred McBride and Phil Wadler. 


2 Dependent type theory for functional programming 

This section introduces the functional core of the type theory in which we work— 
Luo’s UTT (Luo, 1994), extended with local definitions as in (Luo & Pollack, 1992; 
Pollack, 1995; McBride, 1999)—together with a concrete syntax for programming. 
The core language of UTT is summarised in Figure 1. We expect readers familiar 
with type theory to find its technical content largely unremarkable. The notation 
we employ here is not standard, being orientated more towards programming, but 
we hope it is nonetheless clear. For functional programmers with less prior exposure 
to this subject matter, we cannot expect to fill in all the blanks, but we hope that 
we provide enough of an introduction to give access to the ideas in this paper. 

Type theory’s key novelty for the functional programmer is the generalization from
simple function spaces $S \to T$ to dependent function spaces $\forall x : S.\ T$. Here $T$
may involve $x$, making the return type of the function depend on the value of the
argument. We may still write $S \to T$ if $T$ does not contain $x$. Dependency allows
operations on ranges of types, selected by a prior input, such as C-style printf
(Augustsson, 1998), or the generic ‘fold’ for every concrete Haskell type (Altenkirch &
McBride, 2002). It also makes type theory an expressive logic.

Functions themselves are introduced by $\lambda$-terms and applications compute just
by $\beta$-reduction. As we have local definition ($\mathbf{let}\ x \mapsto s : S.\ t$), we dispense with
substitution in the presentation. Definitions are not recursive: the $s$ must exist
before $x$ is bound to it. Under the $\mathbf{let}\ x \mapsto s : S$ binding, $x$ has type $S$ and reduces
to $s$ by $\delta$-reduction, and the binding itself will vanish when $x$ no longer occurs in
scope: we call this $\gamma$-reduction, $\gamma$ for ‘garbage’ (cf. (Severi & Poll, 1994)).

Fig. 1. Luo’s UTT plus local definition (functional core)

UTT has no special treatment of polymorphism, but we may $\forall$-quantify over types
(and other higher-kind objects). There is no danger of paradox: types are collected
in a cumulative hierarchy of universes $\star_n$, individually closed under $\forall$, each inhabiting
and embedded in the next. These level subscripts can be managed mechanically
(Harper & Pollack, 1991), so we shall freely omit them.

Additionally, implicit syntax, a very useful mechanism also due to Pollack (Pollack,
1992), allows us to omit arguments to functions, where they may be inferred by
unification. We mark in the concrete syntax for dependent function types whether 
the argument is to be supplied or omitted by default, writing V T ,s T to indicate 
the latter. We do not demand complete mechanical inference and indeed we may 
override it—if / : <,• T, we may still write f s to supply the argument s ourselves. 

The core language is regulated by a system of mutually inductively defined judgments,
of which the first (typechecking) and third (conversion) hold the
most interest from a programming point of view:


$\Gamma \vdash t : T$   ‘$t$ has type $T$ in context $\Gamma$’: terms $t$ are typechecked with respect to
a context which contains (at least) the declarations $x : S$ or definitions $x \mapsto s : S$
of every variable which may occur free within $t$;

$\Gamma \vdash \mathbf{valid}$   ‘$\Gamma$ is valid’: only those contexts $\Gamma$ make sense whose declarations give
variables legitimate types and whose definitions are type-correct;

$\Gamma \vdash S \cong T$   ‘$S$ is convertible to $T$ in $\Gamma$’: UTT is a computational theory: its
types may contain computations and are identified up to conversion; conversion is the usual
equivalence closure of a reduction relation $\Gamma \vdash s \leadsto t$, generated by congruence
closure from a number of specified one-step contractions; reduction embraces
$\beta$-reduction, as well as other rules detailed below; we do not consider $\alpha$-conversion
explicitly (treatments include (McKinna & Pollack, 1999));

$\Gamma \vdash S \preceq T$   cumulativity polices embedding between universe levels.


This system has a number of very strong meta-theoretic properties: all programs 
terminate, so conversion is decidable, hence so too are cumulativity, validity and 
typechecking (Luo, 1990; Goguen, 1994; Pollack, 1995). 

Remark on meta-notation and meta-operations 

In addition to the above properties of the type theory, we also require a number
of meta-operations. For example, $\Downarrow t$ denotes the unique normal form of $t$.
We typically present these in ‘functional’ style, writing equations in the form
definiendum $\Longrightarrow$ definiens, employing ‘where’ clauses, ‘if-then-else’ etc.

Inspired by de Bruijn’s ‘telescopes’ (de Bruijn, 1991), we manipulate sequences of
bindings and of arguments, writing sequences of terms as vectors $\vec{t}$ (empty vector
$\varepsilon$), and iterated applications as $f\ \vec{t}$. Contexts, denoted by Greek capital letters, may
stand for multiple bindings in $\forall$-, $\lambda$- and let-expressions. That is, we write $\forall\Delta.\ T$
for the dependent function space formed by iteratively ‘discharging’ $\Delta$ over $T$:

$$\begin{array}{lcl}
\forall\varepsilon.\ T & \Longrightarrow & T \\
\forall\Delta; x : S.\ T & \Longrightarrow & \forall\Delta.\ \forall x : S.\ T \\
\forall\Delta; x \mapsto s : S.\ T & \Longrightarrow & \forall\Delta.\ \mathbf{let}\ x \mapsto s : S.\ T
\end{array}$$

Functions $\lambda\Delta.\ t$ and iterated definitions $\mathbf{let}\ \Delta \mapsto \vec{s}.\ t$ are accordingly abbreviated.
Successive bindings with the same type, e.g. $m : \mathbb{N}; n : \mathbb{N}$, are abbreviated as $m, n : \mathbb{N}$.
Finally, $\Delta$ may stand for the vector of its declared variables: if $\Gamma \vdash f : \forall\Delta.\ T$,
then $\Gamma; \Delta \vdash f\ \Delta : T$, even if $\Delta$ contains definitions. (End of remark).

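By the Strengthening Lemma (Luo, 1990; van Benthem Jutting et al., 1994), any
well-typed term $\Gamma \vdash t : T$ arises from a minimal subcontext of $\Gamma$, that is, there
exist contexts $\Gamma_1$, $\Gamma_2$ satisfying:

• $\Gamma_1 \subseteq \Gamma$ minimal such that $\Gamma_1 \vdash t : T$;

• $\Gamma_1; \Gamma_2$ is a permutation of $\Gamma$;

• $\Gamma_1; \Gamma_2 \vdash J$ if and only if $\Gamma \vdash J$, for any judgment $J$.

We shall make frequent use of this fact in the sequel. Indeed, such a context splitting
$(\Gamma_1, \Gamma_2)$ may be computed as $\mathrm{strengthen}(\Gamma, t, T)$, a meta-operation defined as
follows, where $\mathrm{FV}(X)$ denotes the set of variables free in $X$:

$$\begin{array}{l}
\mathrm{strengthen}(\varepsilon, t, T) \Longrightarrow (\varepsilon, \varepsilon) \\
\mathrm{strengthen}(x : S; \Gamma, t, T) \Longrightarrow \\
\quad \mathbf{if}\ x \in \mathrm{FV}(\Gamma_1) \cup \mathrm{FV}(t) \cup \mathrm{FV}(T) \\
\quad \mathbf{then}\ (x : S; \Gamma_1,\ \Gamma_2) \\
\quad \mathbf{else}\ (\Gamma_1,\ x : S; \Gamma_2) \\
\quad \mathbf{where}\ (\Gamma_1, \Gamma_2) = \mathrm{strengthen}(\Gamma, t, T)
\end{array}$$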
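The context-splitting operation can be sketched as an ordinary recursive function. The following Haskell version is a minimal illustration under simplifying assumptions of ours: a binding is represented only by its name and the free variables of its type (or definiens), the context lists the outermost binding first, and the names `Binding`, `bName`, `bFree` and `strengthen` are hypothetical.

```haskell
type Name = String

-- A context entry: a variable plus the free variables of its type
-- (and of its definiens, for a definition).
data Binding = Binding { bName :: Name, bFree :: [Name] }

-- Split a context (outermost binding first) into the bindings needed by a
-- term and type whose free variables are fvs, and the remaining bindings.
strengthen :: [Binding] -> [Name] -> ([Binding], [Binding])
strengthen [] _ = ([], [])
strengthen (b : rest) fvs
  | bName b `elem` needed = (b : g1, g2)   -- something we keep mentions b
  | otherwise             = (g1, b : g2)   -- b can be moved out of the way
  where
    (g1, g2) = strengthen rest fvs
    needed   = fvs ++ concatMap bFree g1
```

For the context `A : *; B : *; x : A` and a term mentioning only `x`, the needed part is `A; x` and the remainder is `B`: `A` is retained because the type of `x` depends on it.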


2.1 Concrete Syntax for Programs 

In this section, we develop our notation for programming, summarised in Figure 2. 

We distinguish an extended expression language expr of this programming notation
from the low-level terms of the underlying type theory. The category expr embraces
the basic constructs of UTT, together with:

• names for datatypes did and their constructors cid;

• a category lhs which forms the left-hand sides of programs;

• a distinguished subcategory call of the lhs, which comprises the allowable
invocations of functions;

• let notation, for local function definitions in expressions;

• view notation, which will be explained in detail in Section 6.

expr       := vid | did | cid | call
            | expr : expr
            | ∀vid : expr. expr | ⋆
            | λvid : expr. expr | expr expr
            | let sig[fid] program, expr

program    := lhs ↦ expr
            | lhs ⇐ expr {seq[program]}
            | lhs | expr {program}

did        := D | ...
fid        := f | ...
call       := fid expr*
lhs        := call (| expr)*

seq[thing] := ε | thing (; thing)*

decl       := data sig[did] where sig[cid]*
            | let sig[fid] program

sig[id]    := seq[vid : expr]
              ───────────────
              id vid* : expr

source     := seq[decl]

Fig. 2. Concrete syntax for dependently typed programs

Top-level source code consists of a sequence of datatype declarations (of which
more in Section 3 below) and definitions of new function symbols fid. These are
introduced using let, which introduces a program with a specified type signature,
given in natural deduction style:

let   seq[vid : expr]
      ───────────────   program
      fid vid* : expr

where the syntax for programs departs from the traditional prioritized list of pattern
matching equations. A program is a hierarchical structure, resembling those of
Augustsson (Augustsson, 1985), which explains how calls to the function f should
be executed, either

• ‘by’ (⇐) invoking an eliminator;

• or ‘with’ (|) the result of an intermediate computation added to the data
under scrutiny;

• or returning (↦) the value of a given expression once enough analysis has
been done. ‘Returns’ lhs ↦ expr are leaves in the program structure.
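The hierarchical shape of programs described above can be pictured as a tree datatype. The Haskell sketch below is only an illustration with assumptions of our own: expressions are plain strings, each ‘by’ or ‘with’ node carries a list of subprograms, and the names `Lhs`, `Program`, `leaves` and `elemCons` are hypothetical.

```haskell
-- Expressions are kept as plain strings; a real front end would use a syntax tree.
type Expr = String

-- A left-hand side: function name, arguments, and the results added by 'with'.
data Lhs = Lhs { lhsFun :: String, lhsArgs :: [Expr], lhsWith :: [Expr] }
  deriving (Eq, Show)

-- The three program forms: return, by, and with.
data Program
  = Return Lhs Expr            -- lhs |-> expr
  | By     Lhs Expr [Program]  -- lhs <= expr { subprograms }
  | With   Lhs Expr [Program]  -- lhs |  expr { subprograms }
  deriving (Eq, Show)

-- Collect the leaves: every left-hand side paired with the expression it returns.
leaves :: Program -> [(Lhs, Expr)]
leaves (Return l e)  = [(l, e)]
leaves (By _ _ ps)   = concatMap leaves ps
leaves (With _ _ ps) = concatMap leaves ps

-- The cons clause of elem from the introduction, analysed 'with' k = l.
elemCons :: Program
elemCons =
  With (Lhs "elem" ["k", "(l :: ls)"] []) "k = l"
    [ Return (Lhs "elem" ["k", "(l :: ls)"] ["true"])  "true"
    , Return (Lhs "elem" ["k", "(l :: ls)"] ["false"]) "elem k ls"
    ]
```

Each subprogram's left-hand side extends its parent's, here by the outcome of the test, which is why vertical alignment can stand in for the repeated patterns.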

To aid readability in this paper, we adopt informal spacing and layout conventions
which are inevitably more sustainable in LaTeX than in ASCII. For example, we tend
to show the hierarchical structure of programs by indentation rather than brackets
and semicolons. Also, from time to time (e.g. in the code for elem), we use vertical
alignment to avoid the repetition of unchanged patterns from the lhs of a program
to those of its subprograms. We shall shortly show how programs determine the
syntactic structure of their subprograms, and hence that some such convention can
be implemented; we omit any further detailed discussion of such pragmatics.


2.2 From Programs to UTT 

We explain the concrete syntax by elaboration into the underlying type theory, 
but to do this, we will have to augment the abstract syntax of UTT (see Figure 3). 
term  := ...                        label := fid term* (| term)*
       | did | cid
       | ⟨label : term⟩
       | call ⟨label⟩ term
       | return term

Fig. 3. Abstract syntax extensions for elaborating programs


The underlying functional core must be extended with the datatype and constructor
names, and to explain the distinguished calls and returns of functions, we introduce:

• labels, label := fid term* (| term)*, which elaborate the category lhs;

• labelled calls, call ⟨label⟩ term, which associate a term with an elaborated lhs;

• their corresponding returns, return term;

• and labelled types, ⟨label : term⟩.

This last construct $\langle l : T\rangle$ is used to label a type $T$ with a function invocation $l$
which, when executed, should return a value in $T$. We call these labelled types
programming problems: they are solved by elaborating programs.

Digression: programming problems in Lego To give an idea of our underlying
motivation for labelled types, consider the following trick which you can play even
in implementations of raw type theory such as Coq or Lego: suppose you want to
implement the addition function $(+) : \mathbb{N} \to \mathbb{N} \to \mathbb{N}$. You might start with this type
as a top-level goal, and invoking $\mathbb{N}$-Elim, get back the subgoals

$$\begin{array}{l}
?\ :\ \mathbb{N} \to \mathbb{N} \\
?\ :\ \mathbb{N} \to (\mathbb{N} \to \mathbb{N}) \to \mathbb{N} \to \mathbb{N}
\end{array}$$

(the precise form of the interaction is not at issue here). Which instance of $\mathbb{N}$ is
which? If you are unsure, it is rather easy to finish the job with a well-typed term
which does not quite add up! Suppose instead that you rephrase the goal, as follows,
via a defined function Plus which is vacuous in its arguments:

$$\begin{array}{l}
\mathsf{Plus} \mapsto \lambda x, y : \mathbb{N}.\ \mathbb{N}\ :\ \mathbb{N} \to \mathbb{N} \to \star \\
?\ :\ \forall x, y : \mathbb{N}.\ \mathsf{Plus}\ x\ y
\end{array}$$

If you normalize the goal, you can see it is just as before. With the unreduced goal,
invoking $\mathbb{N}$-Elim now yields two subgoals

$$\begin{array}{l}
?\ :\ \forall y : \mathbb{N}.\ \mathsf{Plus}\ 0\ y \\
?\ :\ \forall x : \mathbb{N}.\ (\forall z : \mathbb{N}.\ \mathsf{Plus}\ x\ z) \to \forall y : \mathbb{N}.\ \mathsf{Plus}\ (\mathsf{s}\ x)\ y
\end{array}$$

Again, the normal forms of these subgoals are as before, but unreduced, they tell
you exactly which $\mathbb{N}$ is which. Each subgoal shows you the ‘pattern’ to which it
corresponds: in the base case, you are asked to solve the problem “what is $0 + y$?”,
and in the step case, “what is $(\mathsf{s}\ x) + y$?”; the inductive hypothesis shows you which
are the allowable recursive calls, in this case $x + z$ for any $z$. (End of digression).
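Label well-formedness, $\mathit{context} \vdash \mathit{label}\ \mathbf{label}$:

$$\frac{\Gamma \vdash \mathbf{valid}}{\Gamma \vdash f\ \mathbf{label}}
\qquad
\frac{\Gamma \vdash l\ \mathbf{label} \quad \Gamma \vdash t : T}{\Gamma \vdash l\ t\ \mathbf{label}}
\qquad
\frac{\Gamma \vdash l\ \mathbf{label} \quad \Gamma \vdash t : T}{\Gamma \vdash l \mid t\ \mathbf{label}}$$

Typing, $\mathit{context} \vdash \mathit{term} : \mathit{term}$:

$$\frac{\Gamma \vdash l\ \mathbf{label} \quad \Gamma \vdash T : \star_n}{\Gamma \vdash \langle l : T\rangle : \star_n}
\qquad
\frac{\Gamma \vdash l\ \mathbf{label} \quad \Gamma \vdash t : T}{\Gamma \vdash \mathbf{return}\ t : \langle l : T\rangle}
\qquad
\frac{\Gamma \vdash t : \langle l : T\rangle}{\Gamma \vdash \mathbf{call}\ \langle l\rangle\ t : T}$$

Reduction, $\mathit{context} \vdash \mathit{term} \leadsto \mathit{term}$:

$$\Gamma \vdash \mathbf{call}\ \langle l\rangle\ (\mathbf{return}\ t) \leadsto t$$

Fig. 4. Typing and conversion extensions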


The vacuous arguments of Plus echo the use of phantom types in Haskell (Leijen
& Meijer, 1999). These arguments enrich the descriptive power of the type, giving
a more discriminating account of the purpose of its values, not just their representation.
In much the same way, we distinguish $\langle l : T\rangle$ and $T$, and use this to manage
the process of typechecking and elaborating programs by stratifying their return
types, labelling them with the function calls to which they correspond.

The elaboration process relies on computation within labels, so the terms they
contain must be well-typed; this is enforced by a label well-formedness judgment,
$\Gamma \vdash l\ \mathbf{label}$. We give a very simple, and intuitively appealing, operational semantics
to abstract call and return, by extending the reduction relation with $\rho$-reductions
($\rho$ for ‘return’). The new rules are shown in Figure 4.
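The phantom-type analogy can be made concrete with a small Haskell sketch. It is a loose approximation under stated assumptions: labels here are fixed type-level strings (via `DataKinds` and `GHC.TypeLits`), so they cannot mention run-time values as the labels of the type theory do, and `Labelled`, `call`, `plusBase` and `plusStep` are names of our own choosing.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}
import GHC.TypeLits (Symbol)

-- A value of type a, tagged at the type level with the call it answers.
-- The label l is phantom: it does not affect the representation.
newtype Labelled (l :: Symbol) a = Return a

-- Mirrors the reduction: call <l> (return t) reduces to t.
call :: Labelled l a -> a
call (Return t) = t

-- Base case of addition: solves the problem labelled "plus 0 y".
plusBase :: Int -> Labelled "plus 0 y" Int
plusBase y = Return y

-- Step case: the hypothesis solves "plus x z"; we solve "plus (s x) y".
plusStep :: (Int -> Labelled "plus x z" Int) -> Int -> Labelled "plus (s x) y" Int
plusStep hyp y = Return (1 + call (hyp y))
```

Passing `plusBase` directly as the hypothesis of `plusStep` is rejected by the typechecker because the labels differ, even though both functions have the same underlying type; the label records purpose, not representation.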

Each program construct in our notation either refines problems into subproblems or
solves them outright. For nontrivial problems, solving at a leaf is achieved by ‘filling
in the right-hand side’ with the term whose value is to be returned. If every leaf
is solved outright, then the program successfully elaborates. Such a model of successful
elaboration lends itself to a fully-fledged account of type-directed interactive
program development, with all the armoury of techniques currently employed in
implementations of type theory at our disposal. We will return to this point later.

We explain which high-level programs and expressions successfully elaborate with
these new judgment forms:

$\Gamma \Vdash \ell \triangleright l$   ‘left-hand side $\ell$ elaborates to label $l$’;

$\Gamma \Vdash e \triangleright t : T$   ‘expression $e$ elaborates to well-typed term $t$ of type $T$’;

$\Gamma \mid \Delta \Vdash p \triangleright t : \langle l : T\rangle$   ‘in global context $\Gamma$, and local context $\Delta$ of pattern bindings,
program $p$ elaborates to well-typed term $t$ of labelled type $\langle l : T\rangle$’;

$\Gamma \Vdash d \triangleright \Delta$   ‘in context $\Gamma$, declaration $d$ elaborates to new context bindings $\Delta$’.
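Left-hand sides, $\mathit{context} \Vdash \mathit{lhs} \triangleright \mathit{label}$:

$$\frac{\Gamma \Vdash \ell \triangleright l \quad \Gamma \Vdash e \triangleright t : T}{\Gamma \Vdash \ell\ e \triangleright l\ t}
\qquad
\frac{\Gamma \Vdash \ell \triangleright l \quad \Gamma \Vdash e \triangleright t : T}{\Gamma \Vdash \ell \mid e \triangleright l \mid t}$$

Expressions, $\mathit{context} \Vdash \mathit{expr} \triangleright \mathit{term} : \mathit{term}$:

$$[\mathrm{call}]\quad \frac{\Gamma \Vdash c \triangleright l \quad \mathrm{lookup}(l, \Gamma) \Longrightarrow (t : \langle l : T\rangle)}{\Gamma \Vdash c \triangleright \mathbf{call}\ \langle l\rangle\ t : T}$$

[view] See Section 6

Fig. 5. Elaboration of left-hand sides and expressions (edited highlights)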


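Programs, $\mathit{context} \mid \mathit{context} \Vdash \mathit{program} \triangleright \mathit{term} : \langle \mathit{label} : \mathit{term}\rangle$:

$$\frac{\Gamma \mid \Delta \Vdash p \triangleright t : \langle l : S\rangle \quad \Gamma; \Delta \vdash S \cong T}{\Gamma \mid \Delta \Vdash p \triangleright t : \langle l : T\rangle}$$

$$[\mathrm{return}]\quad \frac{\Gamma; \Delta \Vdash \ell \triangleright l \quad \Gamma; \Delta \Vdash e \triangleright t : T}{\Gamma \mid \Delta \Vdash \ell \mapsto e \triangleright \mathbf{return}\ t : \langle l : T\rangle}$$

[by] See Section 4   [with] See Section 5

Fig. 6. Elaboration of programs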


Interpretation We intend the judgments for elaboration of high-level programs 
and those of the type theory to be connected by the following soundness properties, 
which we conjecture follow by simple induction on the rules, together with the 
analysis we provide below of the elaboration rules for the various constructs: 


soundness for    elaboration judgment        yields    underlying judgment

labels           Γ ⊩ l ▷ l                   ⟹        Γ ⊢ l label
expressions      Γ ⊩ e ▷ t : T               ⟹        Γ ⊢ t : T
declarations     Γ ⊩ d ▷ Δ                   ⟹        Γ; Δ ⊢ valid
programs         Γ|Δ ⊩ p ▷ t : (l : T)       ⟹        Γ; Δ ⊢ t : (l : T)


We hope to expand on such meta-theoretical treatment in future work; for now 
it suffices to observe that we obtain a naive operational semantics for programs, 
simply by taking normal forms of elaborated terms. 

The basic structural rules for left-hand sides and expressions are summarised in 
Figure 5; we only give selected instances of the rules for expressions, noting that 
we may incorporate into both forms the use of such notational conveniences as 
infix operators, Pollack-style implicit syntax and universe level inference, and the 
omission of domain types from binders where they can be inferred from usage. Of 
course, the real work is done by the remaining rules which explain the elaboration
of the main programming constructs.

context ⊩ decl ▷ context

[data]   See Subsection 3.2

         Γ ⊩ ∀Φ. R ▷ ∀Δ. T : ★    Γ|Δ ⊩ p ▷ t : (f Δ : T)
[let]    -------------------------------------------------
                      Φ
         Γ ⊩ let  ---------  p  ▷  f ↦ λΔ. t : ∀Δ. (f Δ : T)
                  f Φ : R

Fig. 7. Elaboration of declarations


We explain how the elaboration of a datatype declaration extends the context 
with new bindings, in Section 3. Likewise, we defer the discussion of ‘by’ until 
Section 4, as it requires some considerable analysis—this is the heart of our account 
of ‘structured decomposition on the left’. The elaboration rule for ‘with’ is explained 
in Section 5; in effect it constructs a ‘helper function’ with an extended label. 

Return from a call is straightforward to explain (rule [return], Figure 6): the
elaborated right-hand side is returned, packaged with the label which elaborates
the left-hand side. Given t : T, the problem (l : T) is solved outright.

The rule for declaring a function (see Figure 7) whose type ∀Φ. R and body p
successfully elaborate binds a new definition into the context: a λ-abstracted term
whose type offers solutions to a class of programming problems, namely those whose
labels represent calls to the function. For example, we may define snoc in terms of
++ (‘append’) as follows:

let   xs : List X   x : X
      -------------------
      snoc xs x : List X

      snoc xs x ↦ xs ++ (x :: [])


Here, the [return] rule demands that xs ++ (x :: []) : List X, to ensure that the
equation solves the top-level problem (snoc xs x : List X). We could write all our
programs this way by applying elimination operators in gory detail ‘on the right’.
However, our notation exists to hide this detail, treating elimination ‘on the left’.

Meanwhile, the [call] rule uses the partial (but terminating) meta-operation 
LOOKUP, to search the context for a variable which can be applied to deliver a 
solution to a programming problem with a given label—as delivered by definition. 
Similarly, whilst elaborating a recursive program via an induction principle, the 
local context will contain inductive hypotheses which ‘advertise’ the recursive calls 
they enable via labelled types, just as in our Plus example above. 

The lookup mechanism thus corresponds to a simple proof tactic, like Immed in
Lego. We defer its definition until Subsection 4.1, by which time the structure of
inductive hypotheses will have been made precise. For now, we can say that if Γ
contains an elaborated definition, f ↦ ··· : ∀Δ. (f Δ : T) and t : Δ, then certainly

    LOOKUP(f t, Γ) ⟹ (f t : (f t : let Δ ↦ t. T))

Strictly speaking, this permits the elaboration of calls to defined functions only at
exactly the arity in their signature. However, given that this arity has been specified,
it is a simple matter for the elaborator to handle a call at any arity: calls which are
too long become applications of calls; calls which are too short get η-expanded,
λ-abstracting the extra arguments required.
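
The arity adjustment just described can be illustrated with a small untyped sketch in Python. This is only an analogy under stated assumptions, not the elaborator itself: `call_at_arity`, `snoc` and `const_fn` are hypothetical names introduced here, and Python functions stand in for elaborated definitions whose signature fixes an arity.

```python
from typing import Any, Callable, Sequence


def call_at_arity(f: Callable[..., Any], arity: int, args: Sequence[Any]) -> Any:
    """Apply `f`, declared with exactly `arity` arguments, to any number of `args`."""
    if len(args) == arity:
        # The ordinary case: a call at exactly the declared arity.
        return f(*args)
    if len(args) > arity:
        # Too long: make the call at its arity, then apply the result to the rest.
        result = f(*args[:arity])
        for extra in args[arity:]:
            result = result(extra)
        return result
    # Too short: eta-expand, abstracting over the arguments still required.
    missing = arity - len(args)

    def expanded(*rest: Any) -> Any:
        if len(rest) != missing:
            raise TypeError(f"expected {missing} further argument(s), got {len(rest)}")
        return f(*args, *rest)

    return expanded


def snoc(xs: list, x: Any) -> list:
    # Hypothetical stand-in for the snoc example, declared at arity 2.
    return xs + [x]


def const_fn(a: Any) -> Callable[[Any], Any]:
    # Declared at arity 1, but returns a function, so longer calls make sense.
    return lambda b: a
```

A short call such as `call_at_arity(snoc, 2, [[1, 2]])` yields a function awaiting the missing argument, while a long call such as `call_at_arity(const_fn, 1, [7, 8])` becomes an application of the arity-1 call to the extra argument.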


3 Datatype families, eliminators and computation 


We declare families of datatypes in our language by giving type signatures for the 
type constructor symbol and for its data constructors, in the format 

data   type-constructor-signature   where   data-constructor-signatures

Simple monomorphic datatypes fit this pattern. For example, Unit and Bool:

data   --------   where   ---------
       Unit : ★           () : Unit

data   --------   where   -----------   ------------
       Bool : ★           true : Bool   false : Bool


Note that we write both type and data constructors sans serif. Signatures usually 
take the form of natural deduction rules: for each new symbol, we give the context 
which types its arguments above the line, and the type of the symbol applied to 
those arguments below. Examples include Cartesian products and lists: 


data   A, B : ★     where   a : A   b : B
       ---------            ---------------
       A × B : ★            (a, b) : A × B

data   X : ★        where                 x : X   xs : List X
       ----------           -----------   -------------------
       List X : ★           [] : List X   x :: xs : List X


List X is defined uniformly for any X and makes recursive references only to List X. 
Such a parametric declaration introduces a collection of datatypes each actual 
instance of which could, more tediously, be declared by itself. Families of datatypes 
(Dybjer, 1991) generalize parametric datatypes in two ways. Firstly, they are non- 
uniform: each data constructor targets a subset of the type constructor’s possible 
arguments—Dybjer calls these arguments indices when they are used in this non- 
uniform way. The So family mentioned earlier is a simple example: 


data   b : Bool    where   ------------
       --------            oh : So true
       So b : ★


Secondly, datatype families are mutually declared: a constructor for one subset of 
the indices may refer recursively to other such subsets. A suitable example is the 
family of heterogeneous association lists (‘a-lists’) with a specified domain of Labels: 


data   ls : List Label   where                   l : Label   x : X   h : HAL ls
       ---------------           -------------   ------------------------------
       HAL ls : ★                hnil : HAL []   hcons_X l x h : HAL (l :: ls)


Here, hnil represents the empty a-list, with empty domain, and hcons adds a new
association, of the value x, of type X, with label l to an existing a-list h with
domain ls, yielding an a-list with domain l :: ls. Incidentally, we could easily require
distinct labels by giving hcons an extra argument in So (not (elem l ls)).

More generally, we permit datatype family declarations of this general form: 


data       Φ        where         Φ_1                     Φ_n
       ---------            ---------------   ···   ---------------        (†)
       D Φ : ★              c_1 Φ_1 : D e_1         c_n Φ_n : D e_n


The e_i may differ from Φ and each other, hence a Haskell/Cayenne-style

    data D x y z ... = C1 ... | ... | Cn ...

will not serve. It is also why datatype families are so powerful. Correspondingly,
case analysis on datatype families is rather more subtle than on simple datatypes.
As with function type signatures, if ∀Φ. ★ ▷ ∀Θ. ★ and ∀Φ_i. D e_i ▷ ∀Δ_i. D s_i, then
we obtain D : ∀Θ. ★ and c_i : ∀Δ_i. D s_i.

Remark For readability, we adopt the typographical convention that arguments
with inferrable types need not be declared explicitly in a type signature’s premises,
e.g. X : ★ and ls : List Label in the declaration of hcons. The missing declarations are
inserted (with Pollack-style implicit quantification) among the elaborated context
of arguments; we may subscript such an argument in the conclusion to determine
where it goes. The signature for hcons elaborates to

    hcons : ∀X : ★. ∀ls : List Label. ∀l : Label. X → HAL ls → HAL (l :: ls)

This convention is implementable, by augmenting Pollack’s techniques, but the
details are beyond the scope of this paper. (End of remark).

Dependency in type families allows us to specify operations which enforce additional 
safety constraints by typing alone. For example, we can ensure that projections from 
an a-list apply only to labels in its domain: 

let   k : Label   h : HAL ls   p : So (elem k ls)
      -------------------------------------------
      typeProj k h p : ★

let   k : Label   h : HAL ls   p : So (elem k ls)
      -------------------------------------------
      valProj k h p : typeProj k h p

We develop these operations as a running example: in Subsection 3.1 below, we
explore the impact of dependent case analysis on the types which arise, and in
Subsection 5.1, the necessary coupling between intermediate computations and types.
It is worth noting that there are other presentations of heterogeneous a-lists: we
could index them by signatures in List (Label × ★), or we could index signatures by
domain, then a-lists by signatures. Indeed, this example takes its cue from problems
originally encountered by Pollack in his codings of records in which later field types
depend on earlier field values (Pollack, 2000). In all of these variations, we find the
same problems, and the same solutions.

3.1 Working with datatype families


In this section, we examine the interaction between case analysis and types—clearly 
nontrivial where a function’s return type depends on its argument, but still more 
interesting once datatype families become involved. Although not yet defined, we 
use our high-level notation to facilitate the discussion of our examples. Our purpose 
here is to examine the phenomena which arise in these programs, and which must 
be addressed in the design of any notation for them. 

For many simple programs, there is no interaction between case analysis and types, 
just as in standard functional programming. The familiar elem function contains 
two case-splits (on a List Label and on a Bool) neither of which affects types: 


let   k : Label   ls : List Label
      ---------------------------
      elem k ls : Bool

      elem k []          ↦ false
      elem k (l :: ls)   | k = l
                         | true  ↦ true
                         | false ↦ elem k ls


Examining a value from an indexed datatype family is just as straightforward if
its indices may vary freely. In a function with type ∀Θ. ∀x : D Θ. T, x could come
from any constructor. If T does not depend on Θ or x, it will be unaffected. For
example, we may compute a signature from a heterogeneous a-list:

let   h : HAL ls
      -------------------------
      hSig h : List (Label × ★)

      hSig hnil               ↦ []
      hSig (hcons_X l x h′)   ↦ (l, X) :: (hSig h′)

Once a function space depends even on a simply-typed argument, case analysis 
can change the return type—a phenomenon new to functional programming. For 
example, given a value and a list of labels, we can compute the a-list binding each 
label to the value: 


let   x : X   ls : List Label
      -----------------------
      repeat x ls : HAL ls

      repeat x []          ↦ hnil
      repeat x (l :: ls)   ↦ hcons l x (repeat x ls)


The return type is indexed by the list, so the more we learn about the list, the more
we know about what to return. In the [] case, the right-hand side must have type
HAL [], and hnil is the only candidate; in the step case, we need a HAL (l :: ls), which
suggests applying hcons. No constructor makes a HAL ls for unknown ls, but the
more of ls we can see on the left, the more we can do on the right.


When analysing values from a datatype family, constraining the choice of indices 
can rule out some cases. For example, we may shorten a nonempty a-list: 


let   h : HAL (l :: ls)
      -----------------
      hTail h : HAL ls

      hTail (hcons l x h′) ↦ h′

Why is there no case for hnil? Because there is no way that hnil can make an
inhabitant of HAL (l :: ls)! The type discipline ensures that we need only return
values for constructors delivering elements whose indices lie in the subset under
scrutiny. Further, a constructor may deliver suitable elements only from a portion
of its domain. More generally, suppose we are writing a function f whose type is

    f : ∀Δ. ∀x : D t. T

by case analysis on x, where family D Θ : ★ has constructors c_i Δ_i : D s_i. As Coquand
observes in (Coquand, 1992), we need consider not the whole of D Θ, nor even the
whole of D t, but the intersection between D t and each of the D s_i in turn, as
illustrated in Figure 8.

[Diagram: the ranges of c_1 Δ_1 : D s_1, c_2 Δ_2 : D s_2, c_3 Δ_3 : D s_3 and
c_4 Δ_4 : D s_4, set against D t]

Fig. 8. Constrained case analysis on a datatype family

In this hypothetical example, constructor c_4 is ruled out, just as hnil was for hTail,
whilst every value returned by c_2 lies within D t, as was the case with hcons.
However, we need only consider c_1 Δ_1 for a subset of its possible arguments (those
Δ_1 which make s_1 coincide with t), and similarly for c_3. Moreover, for each c_i, we
need only consider instances of Δ (f’s arguments) which make t coincide with s_i.

This is a real departure for functional programming. Analysing one input x can not
only deliver a restricted set of constructor patterns with some of their arguments
already determined; it can also have a non-local impact, determining the values
of other inputs on which the type of x depends. These instantiations may in turn
change the types of still other inputs, and possibly even the return type of the
function. Examples of these phenomena are found in our definition of typeProj:

let   k : Label   h : HAL ls   p : So (elem k ls)
      -------------------------------------------
      typeProj k h p : ★

      typeProj k hnil p               ⇐ So-case p
      typeProj k (hcons_X l x h′) p   | k = l
                                      | true  ↦ X
                                      | false ↦ typeProj k h′ p


Analysing the h : HAL ls argument gives two cases. In the case where h is hnil, we
also learn (by typing, not testing) that ls is []. Hence p’s type in this case is really
So false. The notation ⇐ So-case p, introduced formally in Section 4, then invokes
case analysis of p, revealing no possible constructor: k cannot occur in [], so there
is no projection to define!

The hcons case is still more interesting: the ‘information for free’ here is that the
domain must be l :: ls′, and the tail h′ : HAL ls′. Moreover, p : So (elem k (l :: ls′)).
Now, elem k (l :: ls′) is computed by testing the result of an intermediate call to
k = l. Hence, when typeProj analyses k = l, it learns, again for free, yet more
about the type of p. In the true case, this does not matter as label k has been found;
in the false case, p’s type becomes So (elem k ls′), exactly the prerequisite for the
recursive call, typeProj k h′ p.

As you can see, some careful choreography is required to keep the testing performed
by typeProj in step with the testing performed by its type. The ‘| k = l’ clause
not only makes the result of the test available for analysis, it abstracts that result
from the type of p. We give the exact details of its elaboration in Section 5.

The valProj function carries out exactly the same analyses as typeProj: 


let   k : Label   h : HAL ls   p : So (elem k ls)
      -------------------------------------------
      valProj k h p : typeProj k h p

      valProj k hnil p             ⇐ So-case p
      valProj k (hcons l x h′) p   | k = l
                                   | true  ↦ x
                                   | false ↦ valProj k h′ p

This is no idle coincidence. Each case-split in valProj also instantiates the return 
type computed by typeProj. This is unremarkable in the hnil case: p’s type is 
empty anyway, just as before. For the hcons case, the subsequent analysis of k = l 
now delivers the value not only of the same test in the type of p, but also in the 
typeProj call, by which the return type is computed. Correspondingly, where x 
is returned in the true case, the return type really is X. In the false case, we must 
return an element of typeProj k h' p, which is exactly the type of valProj k h' p. 
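
To make the shared shape of the two projections concrete, here is a minimal untyped sketch in Python. It is an assumption-laden analogy rather than the paper's notation: an a-list is modelled as a list of hypothetical `(label, type, value)` triples, and the proof argument p : So (elem k ls) is replaced by a runtime `KeyError`, so none of the static guarantees discussed above are captured.

```python
from typing import Any, List, Tuple

Label = str
# hcons_X l x h' is modelled as the triple (l, X, x) followed by the rest of the list.
HAL = List[Tuple[Label, type, Any]]


def elem(k: Label, ls: List[Label]) -> bool:
    # Two case-splits: on the list, then on the result of the test k == l.
    if not ls:
        return False
    l, rest = ls[0], ls[1:]
    return True if k == l else elem(k, rest)


def domain(h: HAL) -> List[Label]:
    return [l for (l, _, _) in h]


def type_proj(k: Label, h: HAL) -> type:
    if not h:
        # hnil case: a proof of So false would be needed, so there is nothing to return.
        raise KeyError(k)
    (l, X, _), rest = h[0], h[1:]
    return X if k == l else type_proj(k, rest)


def val_proj(k: Label, h: HAL) -> Any:
    # Exactly the same analyses as type_proj, returning the value instead of the type.
    if not h:
        raise KeyError(k)
    (l, _, x), rest = h[0], h[1:]
    return x if k == l else val_proj(k, rest)
```

Because `val_proj` performs the same tests in the same order as `type_proj`, the value it returns for a label in the domain is an instance of the type that `type_proj` computes for that label; in the typed setting this agreement is checked statically rather than observed at run time.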

We may summarize the interactions between case-splits and types observed in this
section by means of the following table. We categorize the examples, firstly by the
type of the argument being analysed and secondly by the degree of dependency in
the function space where the analysis occurs. In each meaningful category, we name
an example with the stated dependency and give the argument type.

                  arg’s type
dependency        simple D              free D Θ              constrained D t

none              [elem] List Label     [hSig] HAL ls         [typeProj] So false
on indices        not applicable        [typeProj] HAL ls     [hTail] HAL (l :: ls)
on arg itself     [repeat] List Label   [valProj] HAL ls      [valProj] So false


Programming in Hindley-Milner systems never strays beyond the top left corner of
this table. Recent experiments with polymorphic recursion on nested types (Bird &
Meertens, 1998) begin to stray into the second row, although the indices affected are
always type parameters rather than actual data arguments. Further, the uniform
‘data D Θ = ...’ style of family means that constructors can never be ruled out by
analysing a constrained D t, nor can a particular choice of constructor tell us more
about the indices t, as the intersection of the whole set Θ with t is just t itself.

As we work towards the more powerful techniques and programs inhabiting the 
bottom right corner, we must confront a number of new issues: 

• How do we handle the effects of analysing one argument on other arguments 
and on types? 

• How do we handle the potential complexity of the intersections between
nontrivial argument types D t and nontrivial constructor ranges D s_i?

• How do we handle the impact on types of analysing the result of an
intermediate computation?

The notation we introduce in this paper is a step towards addressing these questions. 
However, before we present the elaboration of the programming constructs, let us 
be precise about the presentation of datatype families in the underlying type theory. 


3.2 Elaborating data declarations 

These ‘data’ declarations (†) of Section 3 elaborate to context extensions by the
rules in Figure 9; the new bindings declare the type- and data-constructors, together
with the elimination operator D-elim, specifying which recursive computations
are permitted over instances of D Θ. The meta-operation HYPS(P, Δ) computes
the appropriate contexts of inductive hypotheses. Elimination operators acquire
computational behaviour by extending the conversion judgment of the type theory
with the ‘ι-reduction’ scheme.

As observed in (Callaghan & Luo, 2000), ι-reduction need not be implemented by
naive pattern matching (as it is in Lego (Pollack, 1994)). A simple switch on the
constructor c_i, in the style of Augustsson (Augustsson, 1985), suffices for the safe
execution of well-typed programs.

context ⊩ decl ▷ context

[data]
    Γ ⊩ ∀Φ. ★ ▷ ∀Θ. ★ : ★
    Γ; D : ∀Θ. ★ ⊩ ∀Φ_i. D e_i ▷ ∀Δ_i. D s_i : ★    (1 ≤ i ≤ n)
    for each a : T in each Δ_i, if D ∈ T then, for some u, T is D u
    ------------------------------------------------------------------
    Γ ⊩ data      Φ       where        Φ_1                     Φ_n
              ---------          ---------------   ···   ---------------
              D Φ : ★            c_1 Φ_1 : D e_1         c_n Φ_n : D e_n
      ▷ D : ∀Θ. ★;  c_1 : ∀Δ_1. D s_1;  ...;  c_n : ∀Δ_n. D s_n;
        D-elim : ∀Θ; x : D Θ.                                 targets
                 ∀P : ∀Θ; x : D Θ. ★.                         motive
                 ∀m_1 : ∀Δ_1; HYPS(P, Δ_1). P (c_1 Δ_1).      methods
                 ...
                 ∀m_n : ∀Δ_n; HYPS(P, Δ_n). P (c_n Δ_n).
                 P x

    where HYPS(P, ·)            ⟹ ·
          HYPS(P, r : D u; Δ)   ⟹ r′ : P r; HYPS(P, Δ)
          HYPS(P, a : A; Δ)     ⟹ HYPS(P, Δ)   otherwise

context ⊢ term ↝ term

    Γ ⊢ D-elim (c_i Δ_i) P m ↝ι m_i Δ_i RECS(P, m, Δ_i)
    where RECS(P, m, Δ_i) : HYPS(P, Δ_i)

          RECS(P, m, ·)            ⟹ ε
          RECS(P, m, r : D u; Δ)   ⟹ (D-elim r P m); RECS(P, m, Δ)
          RECS(P, m, a : A; Δ)     ⟹ RECS(P, m, Δ)   otherwise

Fig. 9. Elaboration of datatype declarations


For N, declared by

data   -----   where   -----    n : N
       N : ★           0 : N    -------
                                s n : N

we obtain

    N : ★;  0 : N;  s : N → N;
    N-elim : ∀x : N. ∀P : N → ★. P 0 → (∀n : N. P n → P (s n)) → P x
    N-elim 0 P m_0 m_s       ↝ι m_0
    N-elim (s n) P m_0 m_s   ↝ι m_s n (N-elim n P m_0 m_s)
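
The ι-reduction scheme can be mimicked by a short untyped sketch in Python. Everything here is an illustrative assumption: the motive P is dropped because Python has no dependent types, values are hypothetical `(constructor, arguments)` pairs, and a signature records only which constructor arguments are recursive, playing the role that HYPS and RECS play in Figure 9.

```python
from typing import Any, Callable, Dict, List, Tuple

Signature = Dict[str, Tuple[bool, ...]]  # constructor name -> which arguments are recursive
Value = Tuple[str, Tuple[Any, ...]]      # (constructor name, arguments)


def recs(sig: Signature, methods: Dict[str, Callable], name: str, args: Tuple) -> List[Any]:
    # One recursive result per recursive argument, in order (compare RECS).
    return [d_elim(sig, a, methods) for a, is_rec in zip(args, sig[name]) if is_rec]


def d_elim(sig: Signature, x: Value, methods: Dict[str, Callable]) -> Any:
    # A switch on the constructor: apply its method to the arguments, then to the
    # results of eliminating the recursive arguments.
    name, args = x
    return methods[name](*args, *recs(sig, methods, name, args))


NAT: Signature = {"zero": (), "suc": (True,)}
LIST: Signature = {"nil": (), "cons": (False, True)}

ZERO: Value = ("zero", ())
NIL: Value = ("nil", ())


def suc(n: Value) -> Value:
    return ("suc", (n,))


def cons(x: Any, xs: Value) -> Value:
    return ("cons", (x, xs))


def from_int(k: int) -> Value:
    n = ZERO
    for _ in range(k):
        n = suc(n)
    return n


def to_int(n: Value) -> int:
    return d_elim(NAT, n, {"zero": lambda: 0, "suc": lambda _n, rec: rec + 1})


def plus(x: Value, y: Value) -> Value:
    # Recursion on x: the zero method returns y, the suc method wraps the recursive result.
    return d_elim(NAT, x, {"zero": lambda: y, "suc": lambda _n, rec: suc(rec)})


def length(xs: Value) -> int:
    return d_elim(LIST, xs, {"nil": lambda: 0, "cons": lambda _x, _xs, rec: rec + 1})
```

With the `NAT` signature, `d_elim` unfolds exactly as the two N-elim rules do: the `zero` method is returned as it stands, and the `suc` method receives n together with the result of eliminating n.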

For all the examples in this paper, it is sufficient to ignore the possibility of
higher-order recursive constructors and presume that all constructor argument types
mentioning D have form D u. Looser recursion regimes are now standard, as are mutual
definitions, but we prefer not to complicate the presentation beyond what is needed
to support the present paper. Moreover it suffices to treat datatype parameters (like
the X in List X) the same way we treat indices: a possible optimization is to abstract
them once at the outside, rather than repeatedly in the motive and methods.

In this section, we develop the tools we need to deploy not merely the
machine-generated elimination operators for datatype families, but any function
whose type has a suitable shape. We say that a term e is a Γ|Δ-eliminator and we
call its type a Γ|Δ-eliminator type if, for any Θ, Δ_i, s_i, t,

    Γ; Δ ⊢ e : ∀P : (∀Θ. ★). (∀Δ_1. P s_1) → ··· → (∀Δ_n. P s_n) → P t
    and Γ; Θ ⊢ valid
    and Γ; P : (∀Θ. ★); Δ_i ⊢ P s_i : ★    (1 ≤ i ≤ n)

It is this central definition, and its abstract characterization of the type-shape, which
drives the generalization of the primitive elimination operators in type theory. We
call an eliminator’s first argument its motive: it shows what is to be gained by
the elimination. The remaining arguments we call methods: they show how the
motive is to be achieved in each case.

An elimination operator is a function f : ∀Δ. E in Γ, such that E is a
Γ|Δ-eliminator type. We say that the Δ are f’s targets: they explain what is to be
eliminated. Our definition thus includes, but is not restricted to, the basic D-elim
operators which come with datatype families.

Note that the traditional presentation of induction principles (as in Subsection 1.1)
orders the arguments: motive, methods, targets. We put the targets first, so that
an elimination operator is a function from targets to eliminators. The ⇐-construct
splits a programming problem into subproblems given an arbitrary eliminator. Of
course, if Γ; Δ ⊢ x : D t, then D-elim x is a Γ|Δ-eliminator.

The [by] rule explains how this splitting proceeds, directed by the eliminator’s type. 
It is shown, with other associated definitions, in Figure 10. The main work is done 
by the meta-operation split, computing the combinator g with which to recombine 
the elaborated subprograms. The account which we give here is a simplified version 
of those in (McBride, 1999; McBride, 2002), adequate for all the examples in this 
paper. Extensions covering more complex rules or more complex combinations of 
recursion are routine, but require more careful bookkeeping than is justified here. 

We shall explain what happens with the help of a worked example, defining hTail:

let   h : HAL (l :: ls)
      -----------------
      hTail h : HAL ls

      hTail h ⇐ HAL-elim h
        hTail (hcons l x h′) ↦ h′

where (showing the indices, but omitting other inferrable information to save
space):

    HAL-elim_(l::ls) h : ∀P : ∀ls. HAL ls → ★.
        P [] hnil →
        (∀X, ls′. ∀l′, x. ∀h′ : HAL ls′. P ls′ h′ → P (l′ :: ls′) (hcons l′ x h′)) →
        P (l :: ls) h

For P, we need a motive such that P (l :: ls) h delivers an element of (hTail h : HAL ls).





Conor McBride and James McKinna 

Heterogeneous Equality

    a : A   b : B
    -------------        --------------
    a = b : ★            refl a : a = a

    q : a = a′    P : ∀a′ : A. a = a′ → ★    m : P a (refl a)
    ---------------------------------------------------------
    =-elim q P m : P a′ q

context ⊢ term ↝ term

    Γ ⊢ =-elim (refl a) P m ↝ m

let   q : a = a′    P : A → ★
      ------------------------       subst q P ↦ =-elim q (λx. λ_ : a = x. P x)
      subst q P : P a → P a′

let   q : a = a′
      --------------                 sym q ↦ subst q (λx : A. x = a) (refl a)
      sym q : a′ = a

Simplification for a method

⌈m : ∀Δ. t = t → M⌉
    ⟹ ⌈m′ : ∀Δ. M⌉;
       m ↦ λΔ. λq. m′ Δ

⌈m : ∀Δ. chalk s = chalk t → M⌉
    ⟹ ⌈m′ : ∀Δ. s = t → M⌉;
       m ↦ λΔ. λq. INJECT q (m′ Δ)

⌈m : ∀Δ. chalk s = cheese t → M⌉    where chalk ≠ cheese
    ⟹ m ↦ λΔ. λq. CONFLICT q M

⌈m : ∀Δ. x = s → M⌉    where x ∈ DOM Δ, s ∉ DOM Δ
    ⟹ ⌈m′ : ∀Δ. s = x → M⌉;
       m ↦ λΔ. λq. m′ Δ (sym q)

⌈m : ∀Δ. c t = x → M⌉    where x ≺ c t
    ⟹ m ↦ λΔ. λq. CYCLIC q M

⌈m : ∀Δ. t = x → M⌉    where (Δ^t; x : T; Δ^x) ⟸ STRENGTHEN(Δ, t, T)
    ⟹ ⌈m′ : ∀Δ^t; x ↦ t : T; Δ^x. M⌉;
       m ↦ λΔ. λq. subst q (λx. ∀Δ^x. M) (m′ Δ^t) Δ^x

⌈m : M⌉ ⟹ m : M

Simplification for a context of methods

⌈·⌉ ⟹ ·
⌈Ψ; m : M⌉ ⟹ ⌈Ψ⌉; ⌈m : M⌉

Splitting a problem

SPLIT(Δ, (l : T), E as ∀P : (∀Θ. ★). ∀Ψ. P t)
    ⟹ let P ↦ λΘ. ∀Δ. Θ = t → (l : T).
         (λ⌈Ψ⌉. λΔ. λe : E. e P Ψ Δ (refl t)
            : ∀⌈Ψ⌉. ∀Δ. E → (l : T))

context | context ⊩ expr ▷ term : (label : term)

[by]
    Γ; Δ ⊩ l ▷ l    Γ; Δ ⊩ e ▷ t : E    for E a Γ|Δ-eliminator type
    SPLIT(Δ, (l : T), E) ⟹ g : (∀Δ_1. (l_1 : S_1)) → ··· → (∀Δ_k. (l_k : S_k)) →
                               ∀Δ. E → (l : T)
    Γ|Δ_i ⊩ p_i ▷ s_i : (l_i : S_i)    (1 ≤ i ≤ k)
    ------------------------------------------------------------------------
    Γ|Δ ⊩ l ⇐ e {p_1; ...; p_k} ▷ g (λΔ_1. s_1) ... (λΔ_k. s_k) Δ t : (l : T)

Fig. 10. The [by] rule and related definitions.

The problem is that although P is applied here to a nonempty environment, it must
still abstract over every environment, empty or not. This is an old problem for
inductive theorem proving (for example in proving ‘generation lemmas’ (Barendregt,
1992; McKinna & Pollack, 1993; McKinna & Pollack, 1999)) and for logic program
transformation (Clark, 1978; Tamaki & Sato, 1984). How do we apply an induction
principle (or an unfolding) to a constrained instance of a relation?

Fortunately, there is also an old solution which has been exploited for many years,
either by hand or mechanically, in these settings: transform ‘this constrained
instance’ to ‘any instance which satisfies these constraints’, where the constraints are
expressed by equations:

    If we could take     P ↦ λks. λh′ : HAL ks. ks = l :: ls → (hTail h : HAL ls)

    then we would have   P (l :: ls) h ≅ l :: ls = l :: ls → (hTail h : HAL ls)

This is what we need, at the cost of supplying a trivial proof. Meanwhile, the
methods required would have types

    m_1 : [] = l :: ls → (hTail h : HAL ls)
    m_2 : ∀X, ls′. ∀l′, x. ∀h′ : HAL ls′.
            (ls′ = l :: ls → (hTail h : HAL ls)) →
            l′ :: ls′ = l :: ls → (hTail h : HAL ls)

For the hnil case, m_1, we have a false equation, hence the method should be supplied
vacuously. For m_2, we have an equation which implies that ls′ = ls, and hence that,
‘morally’, the exposed tail h′ is an acceptable return.
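
The idea of trading a constrained instance for an arbitrary instance plus equations can be imitated at run time. The following Python sketch is only an analogy under explicit assumptions: a-lists are hypothetical tagged tuples, the equation on domains is checked dynamically rather than by typing, and `Absurd` marks the branches that the type discipline would rule out statically.

```python
from typing import Any, Callable, List, Tuple

HNIL: Tuple = ("hnil",)


def hcons(l: str, x: Any, h: Tuple) -> Tuple:
    return ("hcons", l, x, h)


def domain(h: Tuple) -> List[str]:
    return [] if h[0] == "hnil" else [h[1]] + domain(h[3])


class Absurd(Exception):
    """Raised if a method guarded by an unsatisfiable equation is ever run."""


def hal_case(h: Tuple, m_nil: Callable[[], Any], m_cons: Callable[[str, Any, Tuple], Any]) -> Any:
    # Case analysis over every constructor, with no constraint on the domain.
    if h[0] == "hnil":
        return m_nil()
    _, l, x, tail = h
    return m_cons(l, x, tail)


def h_tail(l: str, ls: List[str], h: Tuple) -> Tuple:
    wanted = [l] + ls  # the constrained index l :: ls

    def m_nil() -> Tuple:
        # The equation [] = l :: ls is false, so this method is vacuous.
        raise Absurd("[] = l :: ls has no solution")

    def m_cons(l1: str, x: Any, tail: Tuple) -> Tuple:
        # The equation l' :: ls' = l :: ls gives ls' = ls, so the tail is acceptable.
        if [l1] + domain(tail) != wanted:
            raise Absurd("index does not match l :: ls")
        return tail

    return hal_case(h, m_nil, m_cons)
```

For a well-formed call the `m_nil` branch is never reached and the check in `m_cons` always succeeds, which is the dynamic shadow of the vacuous method m_1 and the trivially satisfied equation in m_2.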

We can mechanize this idea in type theory, yielding the key technique for expressing
high-level programs via elimination operators, hence we reprise it here. In order
to do so, our type theory needs a suitable notion of equality: the heterogeneous
equality shown in Figure 10. This presentation (McBride, 1999) is not yet standard
in type theory: it allows the formation of heterogeneous equations between elements
of any two types, and hence equations between vectors in a given context. We
expand a = b on vectors as a context of equational constraints
q_1 : a_1 = b_1; ...; q_k : a_k = b_k, and correspondingly, refl t as the vector
refl t_1; ...; refl t_k.

Crucially, however, the elimination operator (with K-reduction¹), which gives us
that equality is a congruence, only applies to homogeneous equations: we may only
substitute elements of the same type. It is not the operator which a data declaration
would generate for =, but it still covers all canonical proofs of equations.

Now, in the general case, we have a programming problem ∀Δ. (l : T) and an
eliminator with type ∀P : (∀Θ. ★). ∀Ψ. P t. The SPLIT meta-operation chooses

    P ↦ λΘ. ∀Δ. Θ = t → (l : T)

¹ K being a nod to those authors who have studied an additional constant K which,
for the usual inductively defined equality in type theory, yields power equivalent to
our notion (Streicher, 1993; Hofmann & Streicher, 1994).

Now (in scope of this definition) if we can find methods Ψ where

    Ψ is m_1 : ∀Δ_1; Δ; s_1 = t. (l : T);
         ...
         m_n : ∀Δ_n; Δ; s_n = t. (l : T)

we will have

    λΔ. λe : E. e P Ψ Δ (refl t) : ∀Δ. E → (l : T)

This is the general form of the technique we used in the hTail example, turning a
particular t into equational constraints on a freely chosen Θ described above. The
instantiated constraints characterize the intersections s_i = t in which the indices of
interest lie. Further, in any inductive hypotheses given by expanding P in Δ_i, the
equations give the conditions for making a recursive call. Quantifying over Δ within
the motive P ensures that such inductive hypotheses are as liberal as possible. For
hTail, the motive and the method types, now a little less tidy, are as follows:

    P ↦ λks. λh′ : HAL ks. ∀l, ls. ∀h : HAL (l :: ls).
          ks = l :: ls → h′ = h → (hTail h : HAL ls)

    m_1 : ∀l, ls. ∀h : HAL (l :: ls).
            [] = l :: ls → hnil = h → (hTail h : HAL ls)
    m_2 : ∀X, ls′. ∀l′, x. ∀h′ : HAL ls′.
            (∀l, ls. ∀h : HAL (l :: ls). ls′ = l :: ls → h′ = h → (hTail h : HAL ls)) →
            ∀l, ls. ∀h : HAL (l :: ls).
            l′ :: ls′ = l :: ls → hcons l′ x h′ = h → (hTail h : HAL ls)

These methods m_i will ultimately give rise to the subproblems solved by the
subprograms, but first they are simplified by first-order unification, as in (McBride,
1998; McBride, 1999; McBride, 2002), and once again here.

We present unification in Figure 10 as a meta-operation on a method binding,
⌈m : M⌉, returning a context in which m still has type M, but may now be defined,
either in terms of a simplified method m′ : M′ (with the equations reduced), or
without further assumption (if the equations are demonstrably absurd). Each clause
of the definition explains how to simplify a homogeneous equational hypothesis and
thus takes the form ⌈m : ∀Δ. s = t → M⌉ ⟹ ···. In order to resolve ambiguity,
we prioritize the rules from top to bottom and shorter candidates for Δ over longer.
For reasons of brevity, we omit the explicit enforcement of homogeneity and the
repetition of the input method’s type.

The meta-operations INJECT and CONFLICT deploy proofs that a datatype family
has the ‘no confusion’ property. Meanwhile, CYCLIC exploits the relevant family’s
‘no cycles’ property: the condition x ≺ c t (x is constructor-guarded in c t) holds
if either x ≡ t_i or x ≺ t_i for some i. These properties are derived automatically
when each datatype family is declared: we do not repeat the construction here, but
refer the interested reader to (McBride, 1999).

In the penultimate clause, STRENGTHEN is used to ensure that t is a suitable
candidate to instantiate x, whose binding must fall amongst those not needed to
typecheck t; this subsumes the traditional occur-check. Moreover, computing out the
new definition instantiates x with t in the method’s label.

What can we say about this unification algorithm? Our prioritization ensures that it
is deterministic. Further, for methods ⌈m : ∀Δ. (l : T)⌉, the usual induction (first on
the number of non-equational hypotheses in Δ, then on the number of constructor
symbols in the equations) shows that the algorithm terminates.
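
The prioritized clauses can be read as an ordinary first-order unification procedure. The Python sketch below illustrates that reading under simplifying assumptions: it works on untyped terms, returns a substitution (or `None` for an absurd problem) instead of building proof terms with INJECT, CONFLICT, CYCLIC and subst, treats every string as a variable bound in Δ, and replaces STRENGTHEN by a plain occurrence check. The names `simplify`, `occurs` and `subst` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple, Union

Term = Union[str, tuple]  # a variable name, or (constructor, arg_1, ..., arg_n)


def is_var(t: Term) -> bool:
    return isinstance(t, str)


def occurs(x: str, t: Term) -> bool:
    """True if variable x occurs anywhere in t."""
    if is_var(t):
        return x == t
    return any(occurs(x, a) for a in t[1:])


def subst(x: str, s: Term, t: Term) -> Term:
    """Replace variable x by s throughout t."""
    if is_var(t):
        return s if t == x else t
    return (t[0],) + tuple(subst(x, s, a) for a in t[1:])


def simplify(equations: List[Tuple[Term, Term]]) -> Optional[Dict[str, Term]]:
    """Solve the equations left to right; None means they are absurd."""
    todo = list(equations)
    solution: Dict[str, Term] = {}
    while todo:
        s, t = todo.pop(0)
        if s == t:
            continue                                   # t = t: delete the equation
        if not is_var(s) and not is_var(t):
            if s[0] == t[0] and len(s) == len(t):
                todo = list(zip(s[1:], t[1:])) + todo  # same constructor: injectivity
                continue
            return None                                # different constructors: conflict
        if is_var(s) and not is_var(t):
            s, t = t, s                                # x = s: orient as s = x
        x = t                                          # the equation now reads s = x
        if occurs(x, s):
            return None                                # x guarded by a constructor: cycle
        todo = [(subst(x, s, a), subst(x, s, b)) for a, b in todo]
        solution = {y: subst(x, s, u) for y, u in solution.items()}
        solution[x] = s                                # substitution: instantiate x with s
    return solution
```

Applied to the hTail equations, `simplify([(("cons", "l'", "ls'"), ("cons", "l", "ls"))])` instantiates l and ls, while `simplify([(("nil",), ("cons", "l", "ls"))])` reports the hnil case as absurd. Each step either removes a variable or reduces the number of constructor symbols, which loosely mirrors the termination argument given above.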

We can readily iterate this process across a context of methods, ⌈Ψ⌉. For hTail,
we get something of the following form, with the hnil case solved outright, and the
patterns in the hcons case reduced to those the subprogram requires:

    ⌈Ψ⌉ ⟹
      m_1 ↦ λl, ls. λh. λq : [] = l :: ls.
              CONFLICT q (hnil = h → (hTail h : HAL ls));
      m′ : ∀X, ls. ∀l, x. ∀h : HAL ls.
             (∀l′, ls′. ∀h′ : HAL (l′ :: ls′). ls = l′ :: ls′ → h = h′ → (hTail h′ : HAL ls′)) →
             (hTail (hcons l x h) : HAL ls);
      m_2‴ ↦ .. subst .. m′;  m_2″ ↦ .. subst .. m_2‴;  m_2′ ↦ .. subst .. m_2″;
      m_2 ↦ .. INJECT .. m_2′

Crucially, ⌈Ψ⌉ still binds every method in Ψ, so the SPLIT operation used in the
[by] rule is well-defined: the combinator it computes just abstracts over the simplified
problems, but passes the terms derived for the k ≤ n unsimplified methods to the
eliminator, solving the original problem. The [by] rule checks that these simplified
problems are solved by the subprograms.


4.1 Derived eliminators

As has often been observed, many ‘obviously’ terminating functions do not directly 
fit the pattern of computation supported by D-elim operators—one step of case 
analysis, with recursion on the immediately exposed subterms. Some, such as the 
Fibonacci function, require access more than one step back down the course of 
values. Others, such as McBride’s dependently typed implementation of first-order 
unification (McBride, 2001), perform case analysis on a datatype family (the terms), 
but recursion on an index of that family (the number of unsolved variables). 

One remedy, certainly adequate for these two examples, is to follow Coquand’s 
suggestion and separate case analysis from recursion. Gimenez achieves this in 
Coq (Gimenez, 1994; Gimenez, 1998) by equipping the type theory with primitive 
Case and Fix constructs. The latter permits recursion on any constructor-guarded 
subterm (cf. the previous section) of the argument it addresses.

One does not need the full machinery of an extension by fixpoint constructs, however;
the first author’s version of the same idea is to derive separate case analysis
and recursion operators automatically, given the primitive elimination operator.
The type of the case analysis operator is computed simply by discarding the
inductive hypotheses from the primitive elimination operator:

D-case : ∀Θ; x : D Θ. ∀P : (∀Θ; x : D Θ. ★).
           ∀m₁ : ∀Δ₁. P (c₁ s₁). … ∀mₙ : ∀Δₙ. P (cₙ sₙ). P x

The intrinsic action of ι-reduction on constructor-headed arguments is harnessed
to account for constructor-guarded recursion, via a memoization technique:

D-rec : ∀Θ; x : D Θ. ∀P : (∀Θ; x : D Θ. ★).
          (∀Θ; x : D Θ. D-memo P x → P x) →
          P x

The predicate transformer D-memo computes a ‘course-of-values’ data structure 
storing a value in P y for every y structurally smaller than the given x. This 
structure is just a big tuple, computed by primitive recursion over x. We write 
D-memo informally in pattern matching style—these laws hold as conversions 
but the eliminator translation is straightforward. 

D-memo P (cᵢ Δᵢ) ⇝ ×(HYPS(D-memo P, Δᵢ); HYPS(P, Δᵢ))

where ×(x₁ : T₁; …; xₙ : Tₙ) denotes the Cartesian product T₁ × … × Tₙ. We take
the empty product to be Unit. For N, this gives

N-memo P 0     ⇝* Unit
N-memo P (s n) ⇝* (⇓N-memo P n) × P n

The term justifying D-case is trivial to construct; that for D-rec is a little more
complex, and we refer the interested reader to (McBride, 1999). We may use D-case x
repeatedly, or other means, to instantiate D-memo P x with constructor-prefixed
terms, allowing it to unfold and reveal hypotheses for the guarded subterms. The
meta-operation lookup must therefore be able to search these tuples in order to
project out the solutions to the programming problems corresponding to recursive
calls. Consider, for example, the Fibonacci function:

let    x : N
    -----------
     fib x : N

fib x ⇐ N-rec x
  fib x ⇐ N-case x
    fib 0 ↦ 0
    fib (s x′) ⇐ N-case x′
      fib (s 0) ↦ s 0
      fib (s (s x″)) ↦ fib x″ + fib (s x″)

Here, the initial ⇐ N-rec x will select the following motive and add the
corresponding memo-structure to the context:

P ↦ λn. ∀x. n = x → ⟨fib x : N⟩
memo_x : N-memo P x
LOOKUP(l, Γ; x ↦ s : S)  ⟹  try UNPACK(·, (ε, ε), x, ⇓S)
                              before LOOKUP(l, Γ)
LOOKUP(l, Γ; x : S)      ⟹  try UNPACK(·, (ε, ε), x, ⇓S)
                              before LOOKUP(l, Γ)
where

UNPACK(Δ, (s⃗, t⃗), x, ⟨l′ : T⟩)
    ⟹  (⇓let Δ ↦ u⃗. x : ⇓let Δ ↦ u⃗. ⟨l : T⟩)
        where (Δ ↦ u⃗) unifies l′ with l and s⃗ with t⃗,
        and Γ; Δ ↦ u⃗ ⊢ x : ⟨l : T⟩
UNPACK(Δ, (s⃗, t⃗), f, ∀x : S. T)
    ⟹  UNPACK(Δ; x : S, (s⃗, t⃗), f x, T)    where x ∉ Γ
UNPACK(Δ, (s⃗, t⃗), qf, s = t → T)
    ⟹  UNPACK(Δ, (s⃗; s, t⃗; t), qf (refl s), T)
UNPACK(Δ, (s⃗, t⃗), xy, X × Y)
    ⟹  try UNPACK(Δ, (s⃗, t⃗), snd xy, Y)
        before UNPACK(Δ, (s⃗, t⃗), fst xy, X)

Fig. 11. The lookup meta-operation


In the recursive case, x has been instantiated, and the memo-structure becomes

memo_x : N-memo P (s (s x″)) ⇝* ((⇓N-memo P x″) ×
                                  (∀x. x″ = x → ⟨fib x : N⟩)) ×
                                 (∀x. s x″ = x → ⟨fib x : N⟩)

So, LOOKUP must handle more than just the bindings, f ↦ … : ∀Δ. ⟨f Δ : T⟩,
yielded by the [let] rule; it must extract solutions from hypotheses tupled or
constrained by equations. We define it in Figure 11, giving only the patterns which
lead to progress: if the match fails, so does the operation.

For each binding in Γ, LOOKUP inspects the normal form of its type to check if it can
match the required label l. The real work is done by the auxiliary meta-operation
UNPACK(Δ, (s⃗, t⃗), x, X), which builds a candidate solution x, whilst accumulating
a context Δ which must be instantiated, and a pair of vectors (s⃗, t⃗) which must
be equal, for the candidate to succeed with type X. This X determines the search
strategy: if it is ∀-quantified, try application; if it demands an equation, try a
reflexive proof; if it is a pair, try each projection in turn. Eventually, if UNPACK
reaches a candidate for a programming problem ⟨l′ : T⟩, it checks that l′ subsumes
l by unifying the labels and the accumulated constraints, then typechecking the
instantiated candidate: we use ordinary first-order unification on normalized terms.

For the fib example, lookup does indeed find that

snd (fst memo_x) x″ (refl x″)     : ⟨fib x″ : N⟩
snd memo_x (s x″) (refl (s x″))   : ⟨fib (s x″) : N⟩
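The course-of-values idea can be approximated in Haskell. What follows is a minimal sketch and not the paper's construction: Haskell cannot express the dependent type N-memo P x, so the memo-structure is an ordinary nested pair, and the names Memo, natRec, fstM, sndM and toNat are ours. It only illustrates how the two recursive calls of fib are found by projection from the memo-structure, in the same positions as the two solutions displayed above.

```haskell
-- Unary naturals, standing in for the paper's N.
data Nat = Z | S Nat

-- Untyped stand-in for N-memo P x: a nested pair holding one value for
-- every strictly smaller number, with the most recent value outermost.
data Memo a = Unit | Pair (Memo a) a

-- Stand-in for N-rec: the step function sees the argument and its memo.
natRec :: (Nat -> Memo a -> a) -> Nat -> a
natRec step n = step n (memo n)
  where
    memo Z     = Unit
    memo (S m) = let rest = memo m in Pair rest (step m rest)

-- Projections on the nested pair (partial on the empty memo).
fstM :: Memo a -> Memo a
fstM (Pair rest _) = rest
fstM Unit          = error "fstM: empty memo"

sndM :: Memo a -> a
sndM (Pair _ v) = v
sndM Unit       = error "sndM: empty memo"

-- fib by case analysis, with recursive calls looked up in the memo.
fib :: Nat -> Integer
fib = natRec step
  where
    step Z         _ = 0
    step (S Z)     _ = 1
    step (S (S _)) m = sndM (fstM m) + sndM m  -- fib x'' + fib (s x'')

-- Hypothetical helper for building test inputs.
toNat :: Int -> Nat
toNat k = if k <= 0 then Z else S (toNat (k - 1))
```

Unlike the typed original, nothing here prevents a projection from an empty memo; the dependent type of the memo-structure is exactly what rules that out.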

This definition of lookup is certainly adequate to unpack the solutions to
programming problems exposed by D-case in the memo-structures installed by D-rec.
However, the latter are just particular instances of the general notion of elimination
operator, defined in Section 4, and could have been defined by a programmer
using D-elim; but since they may be generated automatically, we may take them as
given. They capture an important class of allowable recursions; user-defined
elimination operators which capture other interesting recursive call patterns have been
considered elsewhere (McKinna, 2002) and remain the subject of ongoing study.

Of course, hTail and fib, as presented in full above, have rather more bulky code
than functional programmers normally expect to write. Especially annoying is the 
fact that the calls we eventually write on either side already carry the evidence of the 
case analysis and structural recursion which explain them—constructor symbols. 

We can alleviate this problem somewhat by taking a combination of outer D-rec 
and inner D-case applications to be the default explanation of a non-empty block of 
programs wherever a single program is expected. The constructor patterns in these 
programs bound the depth of the splitting which can possibly produce them, and 
there are only finitely many ways to combine recursions lexicographically, hence 
there is at least a clumsy elaboration method. More sophisticated approaches may 
be found in (Cornes, 1997; Abel & Altenkirch, 2000). 

As a consequence of this defaulting strategy, we may suppress the ⇐-clause in hTail,
recovering our earlier statement of the program

let  …        hTail (hcons l x h′) ↦ h′

We may also remove all but the three equations from the program for fib, yielding 
the more familiar 


let    n : N
    -----------
     fib n : N

fib 0          ↦ 0
fib (s 0)      ↦ s 0
fib (s (s n″)) ↦ fib n″ + fib (s n″)


Indeed, in the general case, the only D-case splits which we must retain are those
which yield no cases! The undecidability of type inhabitation obliges us to be explicit
in such situations. In the absence of evidence in the form of a constructor pattern,
which points to a particular argument type being empty, there is no basis on which
to reconstruct the correct D-case term. Examples of this arise with the occurrence
of So false in the hnil branches of typeProj and valProj.


With the derived case analysis and recursion operators, and using this convention,
our type theory can support (by elaboration into large and unreadable terms)
every program admitted by Coquand’s proposed pattern matching language
(Coquand, 1992), as partially implemented in ALF (Magnusson, 1994). Such is the
principal result of the first author’s PhD thesis (McBride, 1999), in which the
original objective had been to dispense with eliminators in favour of pattern matching.
With hindsight, we would recommend exactly the opposite. In our terms, Coquand’s
system hard-wires splitting as if by D-case (with intersections computed by a
unification oracle) and presents recursion only as if by D-rec.

We conclude this section with a simple example using a non-standard eliminator:
the ‘target-first’ variant of N-compare from the Introduction, of type

N-compare : ∀m, n : N. ∀P : N → N → ★.
              (∀x, y : N. P x (x + s y)) →
              (∀x : N.    P x x) →
              (∀x, y : N. P (y + s x) y) →
              P m n

With it, we may define the ‘absolute difference’ function for N:

let      m, n : N
    -----------------
     absDiff m n : N

absDiff m n ⇐ N-compare m n
  absDiff x (x + s y) ↦ s y
  absDiff x x         ↦ 0
  absDiff (y + s x) y ↦ s x

In the original spirit of pattern matching, a testing operation, comparison, has
been safely and clearly combined with a selection operation, subtraction. We shall
present more sophisticated examples in Section 6, where we develop an idiom for
constructing non-standard eliminators by first-order programming.


5 Abstracting Intermediate Computations 

In this section, we introduce our analogue to the proposed pattern guard
notation in Haskell (Peyton Jones, 1997; Peyton Jones & Erwig, 2000): the with
construct, lhs | expr {program}. Pattern guards allow an intermediate computation
to be matched against a single acceptable pattern; if the subsidiary match fails,
control passes to the next line of the program. For example, pattern guards provide
a convenient way to unpack a recursively computed tuple:

unzip [] = ([], [])
unzip ((x, y) : xys) | (xs, ys) <- unzip xys = (x : xs, y : ys)

The basic function of ‘| e’ is to add the result of e to the collection of values under
scrutiny on the left. Subsequent ‘matching’ comes from the ⇐ construct (implicitly,
for standard D-case operators) as usual. The effect is similar to defining a helper
function over all the original ‘pattern variables’ together with the extra value, but
the | is much more compact. With our layout convention, the above becomes:

let        xys : List (A × B)
    -----------------------------
     unzip xys : List A × List B

unzip []              ↦ ([], [])
unzip ((x, y) :: xys) | unzip xys
                      | (xs, ys) ↦ (x :: xs, y :: ys)
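The remark that ‘| e’ behaves like a helper function over the pattern variables plus one extra value can be made concrete in Haskell. This is an illustrative sketch only: the names unzipGuard, unzipHelper and unzipWith are ours, and the helper is one possible desugaring rather than what any compiler is claimed to produce.

```haskell
-- Pattern-guard version: the recursive result is matched on the left.
unzipGuard :: [(a, b)] -> ([a], [b])
unzipGuard [] = ([], [])
unzipGuard ((x, y) : xys)
  | (xs, ys) <- unzipGuard xys = (x : xs, y : ys)

-- Helper-function version: the intermediate value becomes an extra
-- argument, alongside all the original pattern variables.
unzipHelper :: [(a, b)] -> ([a], [b])
unzipHelper [] = ([], [])
unzipHelper ((x, y) : xys) = unzipWith x y xys (unzipHelper xys)

-- Hypothetical helper: original pattern variables plus the extra value.
unzipWith :: a -> b -> [(a, b)] -> ([a], [b]) -> ([a], [b])
unzipWith x y _ (xs, ys) = (x : xs, y : ys)
```

The helper has to repeat every pattern variable in its argument list, which is the bookkeeping that the guard notation avoids.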

Once we have an intermediate value, we can consider more than one case of it, as in 
our version of elem. Haskell’s guards also reduce the tendency of programs which 
mix analysis of their arguments and intermediate values to degenerate into gangling 
right-hand sides built by if and case. This function, counting the number of times
a given tree occurs within another, shows but the tip of the iceberg: 

count s t = if s == t then 1
            else case t of
                   Leaf -> 0
                   t1 `Node` t2 -> count s t1 + count s t2

To connect count’s arguments with the analysis on the right, we must observe the 
recurrence of t. Longer trails of repeated identifiers can easily become confusing, 
and certainly make it harder to tell at a glance what a program does. Here, even a 
Boolean guard is enough to reconnect the program, expressing its analysis clearly 
and concisely on the left: 

count s t | s == t = 1
count s Leaf = 0
count s (t1 `Node` t2) = count s t1 + count s t2

Even without special sugar for booleans or ‘fall-through’, our notation tabulates 
exactly the analysis performed: its ‘laws’ are as clear as its mechanism. 

let  …
    ---------------
     count s t : N

count s t | s = t
  count s t            | true  ↦ s 0
  count s leaf         | false ↦ 0
  count s (t₁ node t₂) | false ↦ count s t₁ + count s t₂

5.1 Abstracting from types 

Clarity notwithstanding, type dependency provides a second motivation for treating 
subcomputations on the left—their impact on types. We have already observed this 
informally with the elem, typeProj, valProj example. In order to connect the 
intermediate label tests in typeProj and valProj with the elem computations at 
the type level, we must abstract the tests from types as well as in the patterns. 

Our ‘with’ notation corresponds directly to an established technique in theorem
proving, namely generalizing a goal by abstracting a subexpression, perhaps to
strengthen an induction, as implemented by the Pattern tactic in Coq (Coq, 2001).
Its elaboration rule is shown in Figure 12.

Using the meta-operation ABST (whose obvious definition as an inverse to
substitution is omitted), the elaborator computes abstractions (lₓ, on labels, and Δₓ, on
contexts): these abstractions must be typechecked again, to ensure that replacing
the elaborated term s by a variable has not compromised validity. The elaborator
then constructs a helper function t from subprogram p, with an extended label; the
main program calls the helper. The normalization of elem k (l :: ls) goes thus:


  context | context ⊩ expr ▷ term : ⟨label : term⟩

         Γ; Δ ⊩ … ▷ l        Γ; Δ ⊩ e ▷ s : S
         (Δ′, Δ_s) ⟸ STRENGTHEN(Δ, s, S)
         lₓ ⟸ ABST(s, x, l_s)        Δₓ ⟸ ABST(s, x, Δ_s)
         Γ; Δ′; x : S; Δₓ ⊢ ⟨lₓ | x : T⟩ : ★
         Γ | Δ′; x : S; Δₓ ⊩ p ▷ t : ⟨lₓ | x : T⟩
[with]  --------------------------------------------------------------
         Γ | Δ ⊩ l | e {p} ▷ let x ↦ s : S. return (call ⟨lₓ | x⟩ t)
                              : ⟨l : let x ↦ s : S. T⟩

Fig. 12. Elaboration of ‘with’ notation

call ⟨elem k (l :: ls)⟩ (List-rec …)
⇝ call ⟨elem k (l :: ls)⟩ (return (call ⟨elem k (l :: ls) | (call ⟨k = l⟩ …)⟩ …))
⇝ call ⟨elem k (l :: ls) | (call ⟨k = l⟩ …)⟩ …

Correspondingly, when checking typeProj k (hcons l x h) p | k = l {…}, we start
in the context

k, l : Label; … ; p : So (call ⟨elem k (l :: ls) | (call ⟨k = l⟩ …)⟩ …)

The term being abstracted, k = l, elaborates to the same (call ⟨k = l⟩ …) as is
found in the type of p, so the subprogram is checked in the context

k, l : Label; b : Bool; … ; p : So (call ⟨elem k (l :: ls) | b⟩ …)

Of course, the ⟨k = l⟩ call is abstracted from the term implementing the ⟨elem …⟩
call, not just from the label. The subsequent analysis of b then allows the type of
p to reduce further. The [with] rule gives the correct behaviour for valProj too,
with abstraction from types working even harder to our benefit.


6 Views: a programming idiom 


We have shown how abstracting an intermediate computation can have useful effects 
on types which depend on it. Case analysis on an intermediate value can also 
instantiate other patterns, if that value comes from a dependent family. In this 
section, we will illustrate this possibility, and show how it leads to an account of 
views, as proposed by Wadler (Wadler, 1987). 

It is a commonplace to equip a datatype with an ordering by implementing a binary
operator returning an element of the enumeration Ordering, given by {lt, eq, gt}. For
N, we might write

let        m, n : N
    --------------------
     cmp m n : Ordering

cmp 0 0         ↦ eq
cmp 0 (s n)     ↦ lt
cmp (s m) 0     ↦ gt
cmp (s m) (s n) ↦ cmp m n


We might then write the absDiff function, by inspecting the result of an
intermediate comparison:

let      m, n : N
    -----------------
     absDiff m n : N

absDiff m n | cmp m n
            | lt …

A minor problem with this approach is that subtraction for N must return bogus 
answers when its second argument is the larger, in order to be a total function. 
More annoying is the fact that cmp has basically done the subtraction, but thrown 
the answer away. We could get around this by extending Ordering with difference 
information, but datatype families offer a more subtle approach. 
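For comparison, here is how the Boolean-style approach might look in Haskell. This is a sketch under assumptions: the paper's own clauses are not reproduced here, the truncated subtraction monus is our stand-in for ‘subtraction for N’, and Haskell's built-in Ordering (LT, EQ, GT) plays the role of the enumeration {lt, eq, gt}.

```haskell
data Nat = Z | S Nat deriving (Eq, Show)

-- Three-way comparison, returning only an uninformative tag.
cmp :: Nat -> Nat -> Ordering
cmp Z     Z     = EQ
cmp Z     (S _) = LT
cmp (S _) Z     = GT
cmp (S m) (S n) = cmp m n

-- Truncated subtraction: total, but returns a bogus Z when the
-- second argument is the larger.
monus :: Nat -> Nat -> Nat
monus m     Z     = m
monus Z     (S _) = Z
monus (S m) (S n) = monus m n

-- absDiff inspects the tag, then redoes the traversal to subtract.
absDiff :: Nat -> Nat -> Nat
absDiff m n = case cmp m n of
  LT -> monus n m
  EQ -> Z
  GT -> monus m n

-- Hypothetical helper for building test inputs.
toNat :: Int -> Nat
toNat k = if k <= 0 then Z else S (toNat (k - 1))
```

Both annoyances named in the text are visible: monus needs a junk case, and each branch of absDiff walks the numbers a second time after cmp has already done so.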


We can define a binary relation on N, with three canonical ways to show that two 
given numbers are comparable: 


data       x, y : N
      -----------------
       Compare x y : ★

where  lt x y : Compare x (x + s y)
       eq x   : Compare x x
       gt x y : Compare (y + s x) y


Of course, every two numbers are comparable in one of these three ways. We can 
prove this by writing a program not much more complex than cmp above: 


let            x, y : N
    --------------------------
     compare x y : Compare x y

compare 0 0         ↦ eq 0
compare 0 (s n)     ↦ lt 0 n
compare (s m) 0     ↦ gt m 0
compare (s m) (s n) | compare m n
  compare (s x) (s (x + s y)) | lt x y ↦ lt (s x) y
  compare (s x) (s x)         | eq x   ↦ eq (s x)
  compare (s (y + s x)) (s y) | gt x y ↦ gt x (s y)


What has happened here? For the base cases, it is easy to choose the appropriate 
constructor and its arguments. To compare s m with s n, however, we must ‘update’ 
the result of comparing m with n, hence we abstract it. But when we analyse a
value in the datatype Compare m n, the arguments m and n become instantiated 
via the more informative constructor types. Inspecting an intermediate value has 
simultaneously told us more about the arguments from which it was computed. 

Analysing the value of compare m n now does the job of comparison, subtraction, 
max and min. We can now write 


let      m, n : N
    -----------------
     absDiff m n : N

absDiff m n | compare m n
  absDiff x (x + s y) | lt x y ↦ s y
  absDiff x x         | eq x   ↦ 0
  absDiff (y + s x) y | gt x y ↦ s x

The instantiated patterns now make quite clear the relationship between the inputs 
and the outputs in each case. We emphasize again that the nonlinear and ‘+’
patterns do not require any ingenious operational behaviour: this is just a clearer 
way to write programs with basically the same operation as cmp. 
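A Haskell approximation shows the operational content. It is only a sketch: Haskell has no indexed family here, so the constructors IsLt, IsEq and IsGt (our names) merely carry the difference information, and the function indices recomputes at run time the pair of numbers that the type Compare x y records statically in the paper.

```haskell
data Nat = Z | S Nat deriving (Eq, Show)

plus :: Nat -> Nat -> Nat
plus Z     n = n
plus (S m) n = S (plus m n)

-- Unindexed stand-in for the Compare family.
data Compare
  = IsLt Nat Nat  -- IsLt x y: the inputs were x and x + s y
  | IsEq Nat      -- IsEq x:   the inputs were x and x
  | IsGt Nat Nat  -- IsGt x y: the inputs were y + s x and y
  deriving (Eq, Show)

-- Same recursion as cmp, but the answer keeps the difference.
compareNat :: Nat -> Nat -> Compare
compareNat Z     Z     = IsEq Z
compareNat Z     (S n) = IsLt Z n
compareNat (S m) Z     = IsGt m Z
compareNat (S m) (S n) = case compareNat m n of
  IsLt x y -> IsLt (S x) y
  IsEq x   -> IsEq (S x)
  IsGt x y -> IsGt x (S y)

-- No subtraction is needed: the evidence already holds the answer.
absDiff :: Nat -> Nat -> Nat
absDiff m n = case compareNat m n of
  IsLt _ y -> S y
  IsEq _   -> Z
  IsGt x _ -> S x

-- The inputs described by a piece of evidence (the paper's indices).
indices :: Compare -> (Nat, Nat)
indices (IsLt x y) = (x, plus x (S y))
indices (IsEq x)   = (x, x)
indices (IsGt x y) = (plus y (S x), y)

-- Hypothetical helper for building test inputs.
toNat :: Int -> Nat
toNat k = if k <= 0 then Z else S (toNat (k - 1))
```

In the dependently typed version, the property indices (compareNat m n) == (m, n) is not a test but a consequence of the type of compare.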

One can perhaps imagine other suites of related testing and selection functions being 
combined into more general analysis methods which deliver informative patterns: 
Haskell’s takeWhile, dropWhile, exists, all, ... each extract different
functionality from the common process of applying a test successively to the elements of a
list until it succeeds (or fails). By giving that process a type which shows whether 
and how the list is split at a particular point, all of these functions, together with 
particular instances like elem, can be combined. We leave this as an exercise. 

The curious thing about compare m n is that once we have seen the patterns it 
yields for m and n, we no longer care about its actual value! The column of patterns 
with It, and so on, in absDiff is unnecessary noise. We can tidy up this idiom of 
testing and selection by examining case analysis over an inductively defined relation. 


6.1 From relations to views 


Wadler’s original views proposal (Wadler, 1987) fits well with the notion of user- 
defined elimination operators. He suggests that any (possibly abstract) datatype T 
may be equipped with a notion of pattern matching by defining an isomorphism 
between T and a datatype D: elements of T may be matched against or built by 
D’s constructors d₁, …, dₙ, with the compiler inserting either component of the
isomorphism, out : T → D or in : D → T, as required. Of course, there is no
guarantee that in and out are either total or mutually inverse. In our setting, such
a view may be expressed by replacing out with an elimination operator,

T-view : ∀t : T. ∀P : T → ★.
           (∀x⃗₁ : X⃗₁. P (d̂₁ x⃗₁)) →
           …
           (∀x⃗ₙ : X⃗ₙ. P (d̂ₙ x⃗ₙ)) →
           P t

where d̂ᵢ is the defined operation by which in interprets dᵢ. Moreover, this type
makes it clear that the t we put in is exactly the (d̂ᵢ x⃗ᵢ) we get out.


It is easy to extract these eliminators from programs like compare above. To see 
how, examine the following two typed terms: 


N-compare m n :                        Compare-case (compare m n) :
  ∀P : N → N → ★.                        ∀P′ : ∀m, n. Compare m n → ★.
  (∀x, y. P x (x + s y)) →               (∀x, y. P′ x (x + s y) (lt x y)) →
  (∀x.    P x x) →                       (∀x.    P′ x x (eq x)) →
  (∀x, y. P (y + s x) y) →               (∀x, y. P′ (y + s x) y (gt x y)) →
  P m n                                  P′ m n (compare m n)
  context ⊩ expr ▷ term : term

         Γ ⊩ e ▷ t : D t⃗
         Γ ⊢ D-case t : ∀P′ : (∀Φ. D Φ → ★). … (∀Δᵢ. P′ s⃗ᵢ (cᵢ Δᵢ)) … → P′ t⃗ t
[view]  ------------------------------------------------------------------
         Γ ⊩ view e ▷ λP : ∀Φ. ★. D-case t (λΦ. λ_ : D Φ. P Φ)
                     : ∀P : ∀Φ. ★. … (∀Δᵢ. P s⃗ᵢ) … → P t⃗

Fig. 13. Elaboration of view


These are almost the same, except that P' (on the right) takes an extra argument— 
the actual value from the Compare family. However, given a candidate motive P for 
N-compare, we can choose to instantiate P' with 

P′ ↦ λm, n. λ_ : Compare m n. P m n

This motive ignores its Compare argument and applies P to just the indices—the 
patterns we wish to keep. Observe then that the following judgment holds: 

λP : ∀m, n : N. ★.                        : ∀P : N → N → ★.
  Compare-case (compare m n)                  (∀x, y. P x (x + s y)) →
    (λm, n. λc : Compare m n. P m n)          (∀x.    P x x) →
                                              (∀x, y. P (y + s x) y) →
                                              P m n

We have just built N-compare! This construction is just what we mean by the 
concrete syntax view compare m n. Figure 13 shows the elaboration rule. 
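The same construction can be imitated in Haskell by treating the eliminator as a function that takes one continuation per pattern. This is a hedged sketch, not the elaboration of Figure 13: the motive P collapses to an ordinary result type r, and the names natCompare, IsLt, IsEq and IsGt are ours.

```haskell
data Nat = Z | S Nat deriving (Eq, Show)

-- Unindexed stand-in for the Compare family, as evidence of ordering.
data Compare = IsLt Nat Nat | IsEq Nat | IsGt Nat Nat

compareNat :: Nat -> Nat -> Compare
compareNat Z     Z     = IsEq Z
compareNat Z     (S n) = IsLt Z n
compareNat (S m) Z     = IsGt m Z
compareNat (S m) (S n) = case compareNat m n of
  IsLt x y -> IsLt (S x) y
  IsEq x   -> IsEq (S x)
  IsGt x y -> IsGt x (S y)

-- Analogue of N-compare: case analysis on (compareNat m n) in which
-- the evidence itself is discarded and only its arguments are passed on.
natCompare
  :: Nat -> Nat
  -> (Nat -> Nat -> r)  -- case m = x,       n = x + s y
  -> (Nat -> r)         -- case m = x,       n = x
  -> (Nat -> Nat -> r)  -- case m = y + s x, n = y
  -> r
natCompare m n onLt onEq onGt = case compareNat m n of
  IsLt x y -> onLt x y
  IsEq x   -> onEq x
  IsGt x y -> onGt x y

-- absDiff written against the eliminator alone.
absDiff :: Nat -> Nat -> Nat
absDiff m n = natCompare m n (\_ y -> S y) (\_ -> Z) (\x _ -> S x)

-- Hypothetical helper for building test inputs.
toNat :: Int -> Nat
toNat k = if k <= 0 then Z else S (toNat (k - 1))
```

A client of natCompare never mentions the constructors of Compare, which is the sense in which the view hides the intermediate data structure.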

There is a general recipe for establishing that a type T can be viewed via patterns
p₁ (over Δ₁) to pₙ (over Δₙ). It readily extends to views of vectors of values. First,
declare the relation

data       t : T          where  c₁ Δ₁ : View-T p₁   …   cₙ Δₙ : View-T pₙ
      ---------------
       View-T t : ★

Second, write the covering function which shows that the view applies to all of T:

let           t : T
    ----------------------
     view-T t : View-T t

The view may be invoked in a function using the ‘by’ construct,

lhs ⇐ view view-T t {programs}

Indeed, as view t is meaningful for any t which belongs to a datatype, we can, in 
particular, use view to show the effect on patterns of the covering function’s own 
recursive calls. The actual code for compare in Figure 14 demonstrates this. 

let            m, n : N
    --------------------------
     compare m n : Compare m n

compare 0 0         ↦ eq 0
compare 0 (s n)     ↦ lt 0 n
compare (s m) 0     ↦ gt m 0
compare (s m) (s n) ⇐ view compare m n
  compare (s x) (s (x + s y)) ↦ lt (s x) y
  compare (s x) (s x)         ↦ eq (s x)
  compare (s (y + s x)) (s y) ↦ gt x (s y)

Fig. 14. Comparison of natural numbers

What we have done is to explain non-standard ‘pattern matching’ via the refinement
of index information which naturally accompanies the standard notion of case
analysis for datatype families, whilst hiding their actual constructors. We hope that the
intermediate data structures we conceal when a view is invoked can also be
eliminated from compiled code by deforestation, a technique for which we also have
Wadler to thank (Wadler, 1990).

Wadler conceived his view notation as syntactic sugar for the insertion of mutually 
inverse coercions between datatypes, one of which admits pattern-matching, the 
other potentially abstract. The idea that a signature for an abstract data structure 
might hide its actual representation, but nonetheless export a notion of ‘pattern 
decomposition’, overcomes a genuine problem in the engineering of modular code. 
Programming with such programmer-definable patterns is exactly what the ⇐
construct permits, with the bonus that the interface is given by a type which can be
required of an exported method in the usual way. Moreover, this type precisely
witnesses the ‘no junk’ direction of the bijection: Wadler is forced by an inexpressive
type system to trust the programmer.

The presentation of views through datatype families also makes it easy to state a
‘no confusion’ property, by stipulating that the covering function view-T delivers
the only possible value in each case. We describe a view for which this property
holds as unambiguous. To prove that such a property holds, we write a program
with the following signature:

             x : View-T t
    --------------------------------
     view-T-unique x : view-T t = x


7 An extended example: typechecking 


This section shows views in action. We develop a typechecker for Church-style
pre-terms in simply-typed λ-calculus. Our language of simple type expressions has a
base type and function spaces:

data   ----------    where   ----------        S, T : TExp
        TExp : ★              o : TExp       ---------------
                                              S ⇒ T : TExp


Contexts are represented (back-to-front) by lists Γ : List TExp of such. We use
a de Bruijn index (de Bruijn, 1972) representation of variables, rendered in type
theory as usual by the datatype family Fin : N → ★, where Fin n has n elements.

data      n : N        where   ---------------         i : Fin n
      -----------               • : Fin (s n)       ----------------
       Fin n : ★                                     ↑i : Fin (s n)


Our source language, Expr n, is the datatype of well-scoped but untyped expressions
with n free variables, the pre-terms. This is quite close to the representation of
untyped terms in (Bird & Paterson, 1999).

data      n : N
      ------------
       Expr n : ★

where      i : Fin n            f, s : Expr n          S : TExp   t : Expr (s n)
       -----------------     -------------------     ----------------------------
        eVar i : Expr n       eApp f s : Expr n            eLam S t : Expr n

Our aim is to write a typechecker for pre-terms, relative to a given context Γ, of
length |Γ|; we implement the typechecker for expressions in Expr |Γ|, by defining
three views respectively:


• for looking up variables in the context; 

• for testing equality of simple types; 

• for typechecking pre-terms. 

Each of these views has a similar flavour: they capture the extraction of structured 
data (like well-typed terms or error diagnostics) from less structured data (like 
pre-terms) by showing that the latter can be viewed as the forgetful image of the 
former. Let us warm up by considering variables. 


7.1 The find view 


We may define the membership relation of a list inductively as follows: 


data   xs : List X   x : X     where   --------------------          i : In xs y
      ---------------------             • : In (x :: xs) x      ----------------------
           In xs x : ★                                           ↑i : In (x :: xs) y

An element of In xs x encodes a reference to a particular x in a list xs. We think
of such a reference as a de Bruijn index into a list, labelled by the x to which it
points, which is why we have overloaded the constructors. We shall use In Γ S to
represent variables of type S over contexts Γ in our definition of well-typed terms.

There is an obvious forgetful map |i|^x from In to Fin, which strips the label. We
usually overload such forgetful maps as |–|, superscripting what the map forgets,
if we ourselves wish to remember it.

let      i : In xs x             |•|^x  ↦ •
    -------------------          |↑i|^y ↦ ↑|i|^y
     |i|^x : Fin |xs|

If we have an unlabelled index in Fin |xs|, we can look it up in xs by ‘unforgetting’
the label. That is, we explain how every unlabelled index arises as the forgetful
image of a labelled index, by means of the following view:
data   xs : List X   i : Fin |xs|     where          i : In xs x
      ----------------------------           ----------------------------
            Find xs i : ★                     found x i : Find xs |i|^x

let   xs : List X   i : Fin |xs|
    ----------------------------
       find xs i : Find xs i

find (x :: xs) •    ↦ found x •
find (x :: xs) (↑i) ⇐ view find xs i
  find (x :: xs) (↑|i|^y) ↦ found y (↑i)

This program fragment shows how we use this view:

check Γ (eVar i) ⇐ view find Γ i
  check Γ (eVar |i|^S) ↦ …


7.2 The type of well-typed terms 


Now that we can represent typed variables, let us define the well-typed terms, in a 
similar fashion to (Altenkirch & Reus, 1999): 


data   Γ : List TExp   T : TExp
      --------------------------
            Term Γ T : ★

where      i : In Γ S                t : Term (S :: Γ) T
       -------------------       ---------------------------
        var i : Term Γ S          lam S t : Term Γ (S ⇒ T)

        f : Term Γ (S ⇒ T)   s : Term Γ S
       -----------------------------------
              app f s : Term Γ T

These constructors just give the typing rules in syntax-directed form. There is an
obvious forgetful map from Term to Expr:

let     t : Term Γ T
    -------------------
     |t|^T : Expr |Γ|

|var i|^S         ↦ eVar |i|^S
|lam S t|^(S ⇒ T) ↦ eLam S |t|^T
|app f s|^T       ↦ eApp |f|^(S ⇒ T) |s|^S


7.3 The eq? view 

Imagine we are in the process of typechecking an application. On one hand, we have
a function, which we have checked has an ⇒-type: that is, we have some |f|^(S ⇒ T).
On the other, we have an argument, which is some well-typed term |s|^A. What we
do not yet know is whether S and A are the same. How will we find out?

We could compute the value of S = A, the usual Boolean equality test. If false,
the application is ill-typed, so we can reject it. But if true, whilst we may know
that = tests equality, the typechecker just knows that S, A : TExp; true : Bool. A
successful = test does not tell the typechecker that S and A are the same, hence
we cannot yet build app f s. The trouble is that a Bool is a bit uninformative. We
can remedy this by presenting equality via a view.
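The contrast between a Boolean test and an informative one can be reproduced in GHC Haskell with GADTs. The sketch below is an approximation under assumptions, not the Eq? view itself: types are represented by singleton values STy (our name), the positive evidence is the standard library's propositional equality (:~:), and the negative evidence (the paper's Isnt) is simply dropped in favour of Nothing.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Data.Kind (Type)
import Data.Type.Equality ((:~:) (Refl))

-- Simple types, promoted to the type level by DataKinds.
data Ty = O | Arr Ty Ty

-- Singleton representation: a value of STy t determines t.
data STy :: Ty -> Type where
  SO   :: STy 'O
  SArr :: STy s -> STy t -> STy ('Arr s t)

-- Boolean test: says yes or no, but teaches the typechecker nothing.
eqBool :: STy s -> STy t -> Bool
eqBool SO           SO           = True
eqBool (SArr s1 t1) (SArr s2 t2) = eqBool s1 s2 && eqBool t1 t2
eqBool _            _            = False

-- Informative test: success comes with a proof that s and t coincide.
eqTy :: STy s -> STy t -> Maybe (s :~: t)
eqTy SO SO = Just Refl
eqTy (SArr s1 t1) (SArr s2 t2) =
  case eqTy s1 s2 of
    Nothing   -> Nothing
    Just Refl -> case eqTy t1 t2 of
      Nothing   -> Nothing
      Just Refl -> Just Refl
eqTy _ _ = Nothing

-- An application can only be packaged when domain and argument agree.
data App where
  App :: STy ('Arr s t) -> STy s -> App

mkApp :: STy ('Arr s t) -> STy a -> Maybe App
mkApp f@(SArr dom _) arg = case eqTy dom arg of
  Just Refl -> Just (App f arg)  -- matching on Refl makes a equal to s
  Nothing   -> Nothing
```

Replacing eqTy by eqBool in mkApp would leave the branch for True unable to build App f arg, since no type equation would be in scope; that is the situation the paragraph above describes.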


The positive cases of eq?

let        S, T : TExp
    ---------------------
     eq? S T : Eq? S T

eq? o o                 ↦ same
eq? o (S₂ ⇒ T₂)         ↦ diff ?₁
eq? (S₁ ⇒ T₁) o         ↦ diff ?₂
eq? (S₁ ⇒ T₁) (S₂ ⇒ T₂) ⇐ view eq? S₁ S₂
  eq? (S ⇒ T₁) (S ⇒ T₂)    ⇐ view eq? T₁ T₂
    eq? (S ⇒ T) (S ⇒ T)      ↦ same
    eq? (S ⇒ T) (S ⇒ T′\T)   ↦ diff ?₃
  eq? (S ⇒ T₁) (S′\S ⇒ T₂) ↦ diff ?₄

Filling in the negative cases

data     S : TExp            let     T : Isnt S
      -------------               ----------------
       Isnt S : ★                   T\S : TExp

[?₁]   isnto S₂ T₂ : Isnt o                      isnto S₂ T₂ \ o          ↦ S₂ ⇒ T₂

[?₂]   isnt⇒ S₁ T₁ : Isnt (S₁ ⇒ T₁)              isnt⇒ S₁ T₁ \ (S₁ ⇒ T₁)  ↦ o

             T′ : Isnt T
[?₃]   --------------------------                isntR T′ \ (S ⇒ T)       ↦ S ⇒ T′\T
        isntR T′ : Isnt (S ⇒ T)

        S′ : Isnt S   T₂ : TExp
[?₄]   -------------------------------           isntL S′ T₂ \ (S ⇒ T₁)   ↦ S′\S ⇒ T₂
        isntL S′ T₂ : Isnt (S ⇒ T₁)

Fig. 15. The equality view

As usual, we declare a relation

data     S, T : TExp       where   ----------------          T : Isnt S
      ---------------               same : Eq? S S      ----------------------
       Eq? S T : ★                                       diff T : Eq? S (T\S)


The first constructor is clear enough, but what is this Isnt S, and what is (T\S)?
The former is a type representing evidence of difference from S, and the latter is
its forgetful map back to TExp (which binds more tightly than ⇒). We do not
write |T|^S, to avoid clashing with the forgetful map for Term. There are many
ways to define Isnt. One obvious candidate is to use existential quantification (or
dependent pairs).

Isnt S ↦ ∃T : TExp. S = T → ⊥          (T, p)\S ↦ T

Another possibility is to define Isnt by recursion on S. We shall declare it as a 
datatype family, but we defer the definition until after our first attempt to write 
the covering function, eq?. At the top of Figure 15, we write what we can without 
fully declaring Isnt. 

Now, we need elements of Isnt types in four places—two for ‘different constructors’, 
and two for differences left or right of =>. The easiest way to define Isnt is just to give 
it constructors for these cases, packing up exactly the information available where 
they are used. The constructor forms declared at the bottom of Figure 15 go in the 
‘holes in the program’ as indicated. Or rather, the constructor forms come from
the holes in the program as indicated. The forgetful map is generated accordingly.
We see no reason why, in an interactive setting, we cannot extract the ‘remainder’
family from the unsolved programming problems.

data   Γ : List TExp   e : Expr |Γ|
      ------------------------------
             Check Γ e : ★

where        t : Term Γ T                      err : Error Γ
       ---------------------------      -----------------------------
        term T t : Check Γ |t|^T         error err : Check Γ |err|

let   Γ : List TExp   e : Expr |Γ|
    ------------------------------
       check Γ e : Check Γ e

check Γ (eVar i) ⇐ view find Γ i
  check Γ (eVar |i|^S) ↦ term S (var i)
check Γ (eLam S t) ⇐ view check (S :: Γ) t
  check Γ (eLam S |t|^T) ↦ term (S ⇒ T) (lam S t)
  check Γ (eLam S |err|) ↦ error ?₁
check Γ (eApp f s) ⇐ view check Γ f
  check Γ (eApp |f|^o s) ↦ error ?₂
  check Γ (eApp |f|^(S ⇒ T) s) ⇐ view check Γ s
    check Γ (eApp |f|^(S ⇒ T) |s|^A) ⇐ view eq? S A
      check Γ (eApp |f|^(S ⇒ T) |s|^S)     ↦ term T (app f s)
      check Γ (eApp |f|^(S ⇒ T) |s|^(A\S)) ↦ error ?₃
    check Γ (eApp |f|^(S ⇒ T) |err|) ↦ error ?₄
  check Γ (eApp |err| s) ↦ error ?₅

Filling in the negative cases

data   Γ : List TExp          let     e : Error Γ
      ---------------              ------------------
        Error Γ : ★                  |e| : Expr |Γ|

            err : Error (S :: Γ)
[?₁]   ---------------------------               |bodyE S err|   ↦ eLam S |err|
         bodyE S err : Error Γ

         f : Term Γ o   s : Expr |Γ|
[?₂]   ------------------------------            |notFunE f s|   ↦ eApp |f|^o s
           notFunE f s : Error Γ

         f : Term Γ (S ⇒ T)   s : Term Γ (A\S)
[?₃]   ----------------------------------------  |mismatchE f s| ↦ eApp |f|^(S ⇒ T) |s|^(A\S)
               mismatchE f s : Error Γ

         f : Term Γ (S ⇒ T)   err : Error Γ
[?₄]   -------------------------------------     |argE f err|    ↦ eApp |f|^(S ⇒ T) |err|
               argE f err : Error Γ

         err : Error Γ   s : Expr |Γ|
[?₅]   -------------------------------           |funE err s|    ↦ eApp |err| s
            funE err s : Error Γ

Fig. 16. The typechecking view

We are now ready to write the typechecker. 
7.4 The check view

We define typechecking as a view Check Γ e on contexts and pre-terms, expressing
any e : Expr |Γ| as the forgetful image either of a Term, or of an Error. Again, we
shall defer giving the constructors of Error until we have identified the holes in the
program check Γ e which establishes the view. At the top of Figure 16, we develop
the algorithm as usual, by case analysis on e, followed by recursive calls to check:

• in the eVar case, there is nothing further to do, as variables are well-scoped; 
it suffices to look up the type from the context, using the find view; 

• in the eLam case, we typecheck the body in an extended context; 

• in the eApp case, we successively check first the function, then the argument, 
and finally match the computed types using the eq? view. 

Viewing each recursive call to check yields two cases, according to whether
typechecking succeeds or fails; in the case of success, the pattern lays bare precisely
the data required for the next call. As with the equality view, we now choose
constructors and define a forgetful map for Error with which we can fill in the five
remaining holes, packing up the information exposed by each of the possible sources
of typechecking failure (see the bottom of Figure 16).
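
As a rough illustration outside the dependently typed setting, the algorithm and its five error cases can be sketched in Python. This is only a sketch under stated assumptions: the tuple encodings of types, pre-terms, terms and errors are hypothetical choices made for this example, and the property that the result forgets back to the input, which the type of check guarantees in the paper, is here merely something one can test at run time.

```python
# Assumed encodings (not from the paper): tuples tagged by constructor name.
# Types:     ("o",) | ("fun", S, T)
# Pre-terms: ("eVar", i) | ("eLam", S, e) | ("eApp", f, s), de Bruijn indices
# Terms:     ("var", i) | ("lam", S, t) | ("app", f, s)
# Errors:    bodyE, notFunE, mismatchE, argE, funE, as in the five holes
O = ("o",)

def fun(s, t):
    return ("fun", s, t)

def check(ctx, e):
    """Return ("term", T, t) or ("error", err) for the pre-term e."""
    tag = e[0]
    if tag == "eVar":
        i = e[1]
        return ("term", ctx[i], ("var", i))            # assumes e is well-scoped
    if tag == "eLam":
        _, s, body = e
        r = check([s] + ctx, body)                     # extended context
        if r[0] == "term":
            return ("term", fun(s, r[1]), ("lam", s, r[2]))
        return ("error", ("bodyE", s, r[1]))           # hole ?1
    _, f, a = e
    rf = check(ctx, f)
    if rf[0] == "error":
        return ("error", ("funE", rf[1], a))           # hole ?5
    _, tf, f_t = rf
    if tf[0] != "fun":
        return ("error", ("notFunE", f_t, a))          # hole ?2
    ra = check(ctx, a)
    if ra[0] == "error":
        return ("error", ("argE", f_t, ra[1]))         # hole ?4
    _, ta, a_t = ra
    if ta != tf[1]:
        return ("error", ("mismatchE", f_t, a_t))      # hole ?3
    return ("term", tf[2], ("app", f_t, a_t))

def forget_term(t):
    """Erase a typed term to its underlying pre-term."""
    if t[0] == "var":
        return ("eVar", t[1])
    if t[0] == "lam":
        return ("eLam", t[1], forget_term(t[2]))
    return ("eApp", forget_term(t[1]), forget_term(t[2]))

def forget_error(err):
    """Erase an error to the pre-term it diagnoses."""
    tag = err[0]
    if tag == "bodyE":
        return ("eLam", err[1], forget_error(err[2]))
    if tag == "notFunE":
        return ("eApp", forget_term(err[1]), err[2])
    if tag == "mismatchE":
        return ("eApp", forget_term(err[1]), forget_term(err[2]))
    if tag == "argE":
        return ("eApp", forget_term(err[1]), forget_error(err[2]))
    return ("eApp", forget_error(err[1]), err[2])      # funE

def forget(result):
    if result[0] == "term":
        return forget_term(result[2])
    return forget_error(result[1])

# Usage: an ill-typed self-application under a lambda.
bad = ("eLam", O, ("eApp", ("eVar", 0), ("eVar", 0)))
result = check([], bad)
assert result[0] == "error" and forget(result) == bad
```

The order of the tests in the application case follows the figure: the function is checked first, its type is inspected before the argument is examined, and the equality test on types comes last.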

The function check is not just a program: it is a proof that typechecking is decidable
for the pre-terms. It does not merely say ‘yes’ or ‘no’, but rather explains each
pre-term as deriving, by a forgetful map, either from a well-typed term or an error
term. Its type guarantees that the term being checked really is the term it is given.
Its analysis is concisely stated and imposes the conditions for well-typedness (and 
its complement) just as they are expressed by the typing rules. 

Moreover, as its recursive calls show, it represents these two possibilities in a
‘pattern matching’ style, visibly delivering either a well-typed term which may be
passed to an exception-free interpreter in the style of Augustsson and Carlsson
(Augustsson & Carlsson, 1999), or a useful error diagnostic. The latter locates the
leftmost type error in a pre-term. It could easily be adapted to find every application
of a well-typed non-function or mismatched application between two well-typed terms,
which is useful information not only for error reporting, but also for type debugging
and repair, as investigated by McAdam (1999).
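
One possible reading of that adaptation, again as a hedged Python sketch rather than anything defined in the paper: a hypothetical function infer keeps traversing after a failure and records every application whose two sides are both well typed but which applies a non-function or mismatches the argument type. The tuple encodings and the names infer, "notFun" and "mismatch" are assumptions made for this example.

```python
# Assumed encodings: types ("o",) | ("fun", S, T); pre-terms
# ("eVar", i) | ("eLam", S, e) | ("eApp", f, s) with de Bruijn indices.
O = ("o",)

def fun(s, t):
    return ("fun", s, t)

def infer(ctx, e, out):
    """Return the type of e, or None if e is ill-typed.

    Diagnostics are appended to out, left to right. Unlike a checker that
    stops at the leftmost error, both sides of an application are always
    visited.
    """
    tag = e[0]
    if tag == "eVar":
        return ctx[e[1]]                      # assumes e is well-scoped
    if tag == "eLam":
        _, s, body = e
        t = infer([s] + ctx, body, out)
        return None if t is None else fun(s, t)
    _, f, a = e
    tf = infer(ctx, f, out)
    ta = infer(ctx, a, out)                   # continue even if f failed
    if tf is None or ta is None:
        return None                           # error already reported below
    if tf[0] != "fun":
        out.append(("notFun", f, a))          # well-typed non-function
        return None
    if tf[1] != ta:
        out.append(("mismatch", f, a))        # domain and argument differ
        return None
    return tf[2]

# Usage: x0 : o and x1 : o => o; both inner applications are faulty.
ctx = [O, fun(O, O)]
x0, x1 = ("eVar", 0), ("eVar", 1)
diagnostics = []
infer(ctx, ("eApp", ("eApp", x0, x0), ("eApp", x1, x1)), diagnostics)
assert [d[0] for d in diagnostics] == ["notFun", "mismatch"]
```

This sketch trades the precise forgetful-image guarantee of the check view for a flat list of diagnostics; recovering that guarantee would need a richer Error family than the one in Figure 16.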


Epilogue 

The main discovery we have made in the light of this research is how little is known, 
not least by ourselves, about functional programming with dependent types. It is no 
longer credible to conceive of dependently typed programming merely as a means to 
relegitimize programs which were lost to us when we moved from untyped languages 
to the Hindley-Milner system. We take its inherent complexity as an opportunity,
rather than a problem, and in so doing, we see emerging a very different possibility
for declarative programming, which we have barely begun to explore. 

This paper has introduced a specific programming notation on top of an existing 
type theory, and shown in detail, through examples and a skeletal formal definition 
which explains how the main constructs are translated, some of the power, as well 
as weight, that is available in this new world. We have extended the notion of 
‘pattern matching’ to embrace any user-definable structured decomposition of data 
on the left, including the use of, and interplay with, intermediate computations and 
result types. We have further related our work specifically to two proposals in the 
functional programming community for extensions to the classical notion of pattern 
matching, Peyton Jones’ pattern guards (1997), and Wadler’s views (1987). 

The former remarks that the potential uses of pattern guards are, can, and should 
be ubiquitous, as they allow “a useful class of programs to be written much more 
elegantly”. We would certainly argue that this is all the more surely the case in our 
setting—with the greater expressivity available with dependent types, that class of 
programs becomes much more interesting. And in our notation, we would argue, 
without any loss of that elegance. Neither we, nor anyone else for that matter, have 
even begun to exhaust the possibilities of programming in such a style. 

As to the latter, we have given a thorough analysis of how views may be presented
using dependent types, as well as a variety of examples of views, and uses
of views not previously considered in the literature. Our general picture allows us 
to consider partial and ambiguous views, to explore trade-offs between recursive 
and non-recursive views, as well as looking at termination proofs and varieties of 
recursion induction (Bove & Capretta, 2001). 

More generally, we take the explosion of power which dependent types bring to
programming, as delineated in Section 3, as a cue to re-evaluate design choices
about the language within which we express programs, the tools with which we 
construct programs, and the programs we choose to write in the first place. This 
includes reassessing the interfaces and implementations of standard data structures 
and algorithms, no less than any other programs. 

We believe that such new languages, tools and libraries as emerge in the future 
will also profit considerably from the experience gained in the wider domain of 
interactive problem-solving with dependent types. While we have downplayed that 
aspect of our research in this paper, our new analysis of the left-hand sides of 
functional programs is strongly rooted in logical considerations and the techniques 
which are supported by existing interactive proof assistants based on type theory. 
We intend in future work to elaborate on these aspects, and the contribution our 
notation may make to declarative proof. 

There is much work to do here in building such a future—in Durham, we have 
dubbed our programme of research Epigram, embracing language, meta-theory, 
implementation and applications. The first author’s experimental extensions to
Lego (1999; 2002) provided tactics for inductive proof supporting the constructions
which underpin the [by] and [with] elaboration rules. These tactics are sufficient
to develop the examples in this paper, but do not support a concrete syntax
for programs as such.

This paper lays the groundwork for a formal language definition for Epigram; 
we are now working on a new prototype implementation based on this definition. 
Clearly many interesting issues remain to be explored, not least at the run-time 
level, studying the operational behaviour of elaborated programs. 

In closing, we return to Wadler, crediting him with the insight that, by constructing 
views, we can and should choose to adapt our perceptions of data to match our 
conceptions of data. We are able to reify his views directly, by using dependent 
types, and by our treatment of the left. So hurrah for Wadler! Welcome to the new 
programming.